Hi Chris,
Thanks for your suggestion! I have decided to change all pandas operations and calculation to numpy but still i faced some problems. Here are the old code:
xxdat['ad'] = (xxdat['pde_norm'].values) * (lam[xxdat['chnl_cd'].values-1]**(xxdat['gap_nrx_promo'].values))
yydat = xxdat.groupby(['spcl_st_cd','nrx_wk', 'chnl_cd', 'nrx_norm'], as_index= False).agg({'ad':test_sum})
and for the first row, the problem is that lam
is a symbolic tensor, so even though we have for example 100 rows, after the multiplication it will still result in only one value: Elemwise{mul,no_inplace}.0
. But behind the scene it has the shape (100,1)
. So I decided to do this calculation row by row with below code and also create groupby function using unique
:
basic_model = pm.Model()
with basic_model:
### prior for unknown parameters
# prior for lambda
BoundedNormal_lam = pm.Bound(pm.Normal, lower=lam_lbd, upper=lam_ubd)
lam = BoundedNormal_lam('lam', mu=lam_mean, sd=lam_sd, shape = 8)
# prior for gamma
BoundedNormal_gamma = pm.Bound(pm.Normal, lower=0.2, upper=0.8)
gamma = BoundedNormal_gamma('gamma', mu=0.5, sd=0.1)
# prior for beta
BoundedNormal_beta = pm.Bound(pm.Normal, lower=b_lbd, upper=b_ubd)
beta = BoundedNormal_beta('beta', mu=b_mean, sd=b_sd, shape = 8)
# prior for v
v = pm.Uniform('v', lower = 1e-2, upper =1e2)
# decayed promotion: pde * lam_j^{t-l}
ad = np.apply_along_axis(caclculate_ad, 1, xxdat_matrix)
xxdat_matrix = np.append(xxdat_matrix,ad.reshape(-1,1),axis = 1)
yydat_matrix = np.empty((len(GroupId1), 5), dtype = 'object')
for index, group in enumerate(GroupId1):
sub_group = xxdat_matrix[GroupId1 == group]
sum_ad = tt.sum(sub_group[:,-1].tolist())
sum_ad_gammma = sum_ad**gamma
yydat_matrix[index,:] = np.append(sub_group[:,[0,1,2,3]][0].reshape(1,-1), np.array([sum_ad_gammma]).reshape(1,-1), axis = 1)
bx = np.apply_along_axis(calculate_bx, 1, yydat_matrix)
mu = tt.mean(bx.tolist())
### likelihood (sampling distribution) of observations
Y_obs = pm.Normal('Y_obs', mu=mu, sd = v, observed = yydat_matrix[:,2])
where some pre-calculated staff and functions are as below:
### Transfer pandas data frame into numpy matrix
xxdat_matrix = xxdat.values
### Get the group ID
GroupId1 = np.unique(xxdat_matrix[:,[0,1,2,3]], axis = 0, return_inverse=True)[1]
def caclculate_ad(x):
'''x is a numpy array contains all the basic information'''
pde_norm = x[5]
channel = int(x[3])-1
return pde_norm*((lam[channel])**(x[6]))
def calculate_bx(x):
'''x is a numpy array contains ad_sum_r information'''
channel = int(x[3])-1
ad_sum_r = x[4]
return beta[channel]*ad_sum_r
Now within pymc3.basic_model
context it is working fine. But the issues is, when i try to sample from it, lets say using find_MAP
, i got this error:
Exception: ('The following error happened while compiling the node', Elemwise{Composite{(Switch(Cast{int8}((GE(i0, i1) * LE(i0, i2))), (i3 * i4 * (i5 - i4) * (i6 + i7)), i8) + (i9 * (Switch(EQ(i10, i8), i8, (i11 * log(i10) * i12)) + Switch(EQ(i13, i8), i8, (i14 * log(i13) * i15))
It is a very long error and not intuitive at all to me… any idea on this?
Thanks so much for your time!
Best,
Fan