Help for Theano calculation

Hi All,

I came up with a problem yesterday and thanks for chartl’s help that I could now figure out what is the real reason. But still I cannot resolve it. The problem is:

I want to use MCMC to sample data for several parameters. The likelihood function kind of following this format:

Likelihood \text{~} N(\mu,\sigma)
where \mu = (\sum_{i=1}{X_i['a'] * \lambda^{X_i['b']}})^\gamma, X is the data; a, b are column names, \lambda is the first parameter and \gamma is the second one.

The difficulty is that:

  1. After calculating everything within the brackets (\sum_{i=1}{X_i['a'] * \lambda^{X_i['b']}}), we will use \lambda's information and it results in a theano tensor operation Elmentwise{mul,inplace} for each row.
  2. Here sum is a group sum (which means we need to group by first), but we cannot use pandas group by cause previous step results in theano tensors.

The original problem is here:


I’m having trouble understanding something. Should \mu be an array with one entry per group?

If yes, you can do the grouping by hand using unique. You just need to add return_inverse=True to get the indexes of each group’s elements in X.

Hi Luciano,

Thanks for your reply! The thing is that I will need to group by more than one column, but unique is only for 1-D array. After these days learning of theano and pymc3, here is my main code now:

with pm.Model() as model:
    ### prior for unknown parameters
    # prior for lambda
    BoundedNormal_lam = pm.Bound(pm.Normal, lower=lam_lbd, upper=lam_ubd)
    lam = BoundedNormal_lam('lam', mu=lam_mean, sd=lam_sd, shape = 8)
    # prior for gamma
    BoundedNormal_gamma = pm.Bound(pm.Normal, lower=0.2, upper=0.8)
    gamma = BoundedNormal_gamma('gamma', mu=0.5, sd=0.1)
    # prior for beta
    BoundedNormal_beta = pm.Bound(pm.Normal, lower=b_lbd, upper=b_ubd)
    beta = BoundedNormal_beta('beta', mu=b_mean, sd=b_sd, shape = 8)
    # prior for v
    v = pm.Uniform('v', lower = 1e-2, upper =1e2)
    # decayed promotion: pde * lam_j^{t-l}
    xxdat['ad'] = (xxdat['pde_norm'].values) * (lam[xxdat['chnl_cd'].values-1]**(xxdat['gap_nrx_promo'].values))
    # Sum up of decayed promotion for each NRx + each channel: \sum_t pde * lam_j^{t-l}
    yydat = xxdat.groupby(['spcl_st_cd','nrx_wk', 'chnl_cd', 'nrx_norm'], as_index= False).agg({'ad':test_sum})
    yydat.rename(columns = {'ad': 'ad_sum'}, inplace = True)
    yydat['ad_sum_r'] = (yydat['ad_sum'] ** gamma) # Add adstock diminishing factor (\sum_t pde * lam_j^{t-l})^r
    # normalize each channel's adstock
    yydat['ad_norm'] = (yydat['ad_sum_r'].values)/(yydat.groupby('chnl_cd', as_index=False).agg({'ad_sum_r':test_mean})['ad_sum_r'][yydat['chnl_cd'].values-1]).values
    # sum up adstock multiplying impact factor over channels
    yydat['bx'] = (beta[yydat['chnl_cd'].values -1]*(yydat['ad_norm']))
    zzdat = yydat.groupby(['spcl_st_cd', 'nrx_wk', 'nrx_norm'], as_index = False).agg({'bx':test_sum})
    # Get the final mu
    mu = test_mean(zzdat['bx'])
    ### likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd = v, observed = zzdat['nrx_norm'])

And here are these two test functions:

def test_sum(x):
    return tt.sum(x.to_list())

def test_mean(x):
    return tt.mean(x.to_list())

It can run successfully with row number < 256. But once it is larger than it, i got this error:

fatal error: bracket nesting level exceeded maximum of 256. 

I have tried changed the maximum number to some larger number but it give some other errors. So i think still i need to revise my code. Do you have any idea about my case? Thank you!

unique can work on multidimensional arrays using the axis argument. It you want to find the unique rows, you just use axis=0.

Don’t use to_list. Lists can hold anything inside of them. When you do a sum on a list, theano seems to write down weird c code, using many nested {sum something's}, and that raises the error you see. Can you use values instead? I imagine that you can’t because there are symbolic tensors in the middle. My recommendation is to stop using pandas groupby and switch completely to unique

1 Like

Hi Luciano,

Thank you so much for the reply. Your guessing is correct, the nested aggregations caused this issue and also I cannot uses .values instead of to_list. So I have decided to change all pandas operations and calculation to numpy but still i faced some problems. Let’s only focus on the first two rows of codes (old code):

xxdat['ad'] = (xxdat['pde_norm'].values) * (lam[xxdat['chnl_cd'].values-1]**(xxdat['gap_nrx_promo'].values))
yydat = xxdat.groupby(['spcl_st_cd','nrx_wk', 'chnl_cd', 'nrx_norm'], as_index= False).agg({'ad':test_sum})

For the first row, the problem is that lam is a symbolic tensor, so even though we have for example 100 rows, after the multiplication it will still result in only one value: Elemwise{mul,no_inplace}.0. But behind the scene it has the shape (100,1). So I decided to do this calculation row by row with below code and also create groupby function based on your suggestions using unique:

# decayed promotion: pde * lam_j^{t-l}
ad = np.apply_along_axis(caclculate_ad, 1, xxdat_matrix)
xxdat_matrix = np.append(xxdat_matrix,ad.reshape(-1,1),axis = 1)
yydat_matrix = np.empty((len(GroupId1), 5), dtype = 'object')
for index, group in enumerate(GroupId1):
    sub_group = xxdat_matrix[GroupId1 == group]
    sum_ad = tt.sum(sub_group[:,-1].tolist())
    sum_ad_gammma = sum_ad**gamma
    yydat_matrix[index,:] = np.append(sub_group[:,[0,1,2,3]][0].reshape(1,-1), np.array([sum_ad_gammma]).reshape(1,-1), axis = 1)
bx =  np.apply_along_axis(calculate_bx, 1, yydat_matrix)
mu = tt.mean(bx.tolist())

# Helper function 
def caclculate_ad(x):
    '''x is a numpy array contains all the basic information'''
    pde_norm = x[5]
    channel = int(x[3])-1
    return pde_norm*((lam[channel])**(x[6]))

Now within pymc.model context it is working fine. But the issues is, when i try to sample from it, lets say using find_MAP, i got this error:

Exception: ('The following error happened while compiling the node', Elemwise{Composite{(Switch(Cast{int8}((GE(i0, i1) * LE(i0, i2))), (i3 * i4 * (i5 - i4) * (i6 + i7)), i8) + (i9 * (Switch(EQ(i10, i8), i8, (i11 * log(i10) * i12)) + Switch(EQ(i13, i8), i8, (i14 * log(i13) * i15)) 

It is a very long error and not intuitive at all to me… any idea on this?

Thanks so much for your time!


Forgot to add, groupID was pre-calculated with the following code:

### Get the group ID 
GroupId1 = np.unique(xxdat_matrix[:,[0,1,2,3]], axis = 0, return_inverse=True)[1]

Hi @fgong

This is almost verbatim your post here Using Pandas within Pymc3 context?. Rather than duplicating the content on this thread, would you mind simply linking from one topic to the other?


1 Like

Don’t mind Chris. My initial idea was to ask different topic questions. Sorry to make it almost a duplicate.

Also thank u so much Luciano! without your help I would not be able to solve this problem!