How to stack same rows of prior distributions inside the with model effectively

yanlingzuo · July 9, 2020, 2:32pm

Hi, I’m new to PyMC3, working on fitting a hierarchical Bayes mixed logit model using the yogurt data. Specifically, I have the following code so far

Yogurt_Model = pm.Model()
 with Yogurt_Model:
       mu = pm.MvNormal('mu', mu = mu_mean, cov = mu_cov, shape = len(mu_mean))

       packed_L = pm.LKJCholeskyCov('packed_L', n = 3, eta = 2., sd_dist = pm.HalfCauchy.dist(2.0))
       L = pm.expand_packed_triangular(3,packed_L)
       
       alpha = pm.MvNormal('alpha', mu = mu, chol = L, shape = (4,len(mu_mean)))

Now my alpha matrix has 4 x 3 dimension. After this step, I’d like to expand alpha matrix vertically to repeat first alpha row 5 times, second alpha row 2 times, third alpha row 3 times, and fourth alpha row 3 times. Could anyone show me a quick way of doing that instead of using for loop?

Thank you!

ckrapu · July 9, 2020, 4:03pm

Yes - here’s a modified code block that implements the expansion. I don’t have the variable mu_mean so I hardcoded some values in. The basic idea is that we use the row indexing for alpha to pick off the right values. Note that we can repeatedly use the same index and thus repeat a value more than one time.

import pymc3 as pm
import numpy as np
Yogurt_Model = pm.Model()

reshaping_indices = [0]*5 + [1]*2 + [2]*3 + [3]*3

with Yogurt_Model:
    
    mu = pm.MvNormal('mu', mu=np.zeros(3), cov=np.eye(3),shape=3)

    packed_L = pm.LKJCholeskyCov('packed_L', n = 3, eta = 2., sd_dist = pm.HalfCauchy.dist(2.0))
    L = pm.expand_packed_triangular(3,packed_L)

    alpha = pm.MvNormal('alpha', mu = mu, chol = L, shape = (4,3))
    
    alpha_reshaped = pm.Deterministic('alpha_reshaped',alpha[reshaping_indices])
    trace = pm.sample()
   
# Print off the posterior means
print(trace['alpha'].mean(axis=0))
print(trace['alpha_reshaped'].mean(axis=0))

yanlingzuo · July 9, 2020, 6:39pm

Thank you very much! @ckrapu Much simpler than my naive for loop.

yanlingzuo · July 9, 2020, 9:41pm

@ckrapu sorry for seeking your advice again. I just don’t have a good handle on how to manipulate things inside the with part. A quick way to append zero column to the last column of the alpha_reshaped matrix? Thank you very much! Any good reference for me to study would be greatly appreciated as well. Thank you!

yanlingzuo · July 9, 2020, 10:19pm

@ckrapu tried this. Seems work where tt for import theano.tensor as tt

alpha_reshaped = pm.Deterministic(‘alpha_reshaped’, alpha[choices_index])
zero_vec = np.zeros((len(choices_index),1))
test = tt.concatenate([alpha_reshaped,zero_vec], axis = 1)

Looks like a good way of doing it? Thank you again

ckrapu · July 9, 2020, 10:20pm

Yup - that’s exactly right!

yanlingzuo · July 9, 2020, 10:20pm

@ckrapu Thank you so much! learned a lot from you.

yanlingzuo · July 10, 2020, 4:35pm

@ckrapu Hi Chris, sorry to bother you again. The code I have below uses for loop to dot product of every two columns with beta vector, totally 8 columns in the matrix. I wonder whether this is better way to do this. This is inside the with part. Thanks a lot.

   # Define utility for fixed effects
    utility_fixed = tt.dot(X_fixed[:,[0,1]], beta)
    utility_fixed = tt.reshape(utility_fixed, (69,1))
           
    for i in range(3):
        part = X_fixed[:, [2 + i, 3 + i]]
        dot_prod = tt.dot(part, beta)
        dot_prod = tt.reshape(dot_prod, (69,1))
        utility_fixed = tt.concatenate([utility_fixed, dot_prod], axis = 1)

ckrapu · July 10, 2020, 5:08pm

No worries - I am glad to help! Here’s a snippet that I think illustrates how to solve your problem. I recommend you play around with values of beta and inspect the resulting numeric values to make sure that it does what you want. The main thing I do here is reshaping X_fixed to get an extra dimension that I can use for broadcasting.

import theano.tensor as tt
import numpy as np

d1 = 10 # Number of rows in X_fixed
d2 = 6  # Number of columns in X_fixed. This must be an even number.
n_pairs_cols = int(d2/2) # Number of pairs of columns.

# Create toy data
X_fixed = np.arange(d1*d2).reshape([d1,d2],order='F')
beta = np.ones([2])

# Add an extra axis corresponding to distinct pairs of columns
X_reshaped = tt.reshape(X_fixed,(d1,n_pairs_cols,2))

# apply elementwise multiplication and sum over the 
# last axis to aggregate by pairs
dot_prod = (X_reshaped * beta).sum(axis=-1)
print(dot_prod.shape.eval())
print(dot_prod.eval())

yanlingzuo · July 10, 2020, 5:10pm

@ckrapu Thank you very much! I’ll study your snippet carefully and modify my code. Thank you!

yanlingzuo · July 10, 2020, 6:09pm

@ckrapu your code is so powerful!!! Thank you very much!

AlexAndorra · July 17, 2020, 10:26am

Also, note that in PyMC >= 3.9.0, you can access the expanded Cholesky factor automatically:

L, Rho, sigmas = pm.LKJCholeskyCov(
    'chol_cov', n=3, eta=2., sd_dist=pm.HalfCauchy.dist(2.0), 
    compute_corr=True
)

yanlingzuo · July 17, 2020, 2:27pm

@AlexAndorra Thank you very much for the info!

Topic		Replies	Views
Problem with pymc3.Multinomial() distribution? Questions	3	503	July 14, 2020
Bayesian multilevel model indexing problems Questions	1	441	June 4, 2021
Managing indices/data shape in a multi-level linear regression for incremental model updating Questions	0	333	September 10, 2020
Modelling/transferring a BUGS example to PyMC3 Questions	1	465	April 1, 2021
Confused about building a compound distribution hierarchical Bayes model with continuous and categorical inputs in pymc3 Questions	1	728	June 8, 2019

How to stack same rows of prior distributions inside the with model effectively

Related Topics