Dear PyMC3 developers:
Not sure what’s going on with the pm.Multinomial() distribution, or whether it’s a coding mistake on my part, as I’m new to PyMC3. I was trying to fit a Bayesian hierarchical model to the yogurt data, starting with 2 households and working up to 5. As the number of households increases from 2 to 5, the draws for my multivariate alpha random variable get smaller and smaller. Here alpha has shape (number of households x 3). I copied the data into the code below for up to 5 households for testing. Each response y from a household contains 4 values (the 4 choices), and there are multiple rows of y for each household. alpha represents the choice preferences of the different households; it has one column fewer than y because that last level serves as the reference in the coding. Here is the specific code for the 5-household case.
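To spell out the parameterization I have in mind (my own notation, just for clarity): for household h, with free utilities alpha_h = (alpha_h1, alpha_h2, alpha_h3) and the fourth brand fixed at 0 as the reference level,

p_hj = exp(alpha_hj) / (exp(alpha_h1) + exp(alpha_h2) + exp(alpha_h3) + 1) for j = 1, 2, 3, and p_h4 = 1 / (exp(alpha_h1) + exp(alpha_h2) + exp(alpha_h3) + 1),

so each row of y should be a single multinomial draw (n = 1) with probability vector p_h.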
import theano.tensor as tt
import pymc3 as pm
import numpy as np
num_house = 5
np.random.seed(123)
mu_mean = np.zeros(3)
mu_cov = np.array([[625, 200, 200],[200,625,200], [200, 200, 625]])
# household index for each of the 71 observations
choices_index = [0]*8 + [1]*20 + [2]*15 + [3]*14 + [4]*14
# Build the y data (71 x 4) for 5 households
Yoplait = [0]*8 + [1]*4 + [0]*19 + [1]*8 + [0]*13 + [1]*3 + [0]*2 + [1]*3 + [0]*3 + [1]*2 + [0]*5 + [1]
Yoplait = np.array(Yoplait).reshape(71, 1)
Dannon = [0] + [1]*6 + [0]*21 + [1]*3 + [0]*9 + [1]*3 + [0]*2 + [1]*2 + [0]*3 + [1]*2 + [0]*4 + [1] + [0]*3 + [1]*3 + [0]*2 + [1]*5 + [0]
Dannon = np.array(Dannon).reshape(71, 1)
Weight = [1] + [0]*6 + [1] + [0]*4 + [1]*16 + [0]*11 + [1] + [0]*15 + [1] + [0]*15
Weight = np.array(Weight).reshape(71, 1)
Hiland = [0]*43 + [1]*2 + [0]*2 + [1]*3 + [0]*21
Hiland = np.array(Hiland).reshape(71, 1)
y = np.concatenate((Yoplait, Dannon, Weight, Hiland), axis=1)
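A quick sanity check on the data, included here for reference (my own addition, not part of the model): since n = 1 in the multinomial, each row of y should be a one-hot choice.

# every observation picks exactly one of the 4 brands
assert y.shape == (71, 4)
assert (y.sum(axis=1) == 1).all()
assert len(choices_index) == y.shape[0]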
if __name__ == '__main__':
    Yogurt_Model = pm.Model()
    with Yogurt_Model:
        # Prior on the population mean of alpha (the household utility parameters)
        mu = pm.MvNormal('mu', mu=mu_mean, cov=mu_cov, shape=len(mu_mean))
        # Prior on the covariance of alpha via a Cholesky factor
        packed_L = pm.LKJCholeskyCov('packed_L', n=3, eta=2., sd_dist=pm.HalfCauchy.dist(2.5))
        L = pm.expand_packed_triangular(3, packed_L)
        # Prior on alpha: one 3-vector of utilities per household
        alpha = pm.MvNormal('alpha', mu=mu, chol=L, shape=(num_house, len(mu_mean)))
        # Repeat each household's alpha row once per observation
        alpha_reshaped = pm.Deterministic('alpha_reshaped', alpha[choices_index])
        # Append a zero column for the reference level
        zero_vec = np.zeros((len(choices_index), 1))
        alpha_reshaped = tt.concatenate([alpha_reshaped, zero_vec], axis=1)
        # Softmax converts utilities to the multinomial probabilities
        p = pm.Deterministic('p', tt.nnet.softmax(alpha_reshaped))
        # Likelihood: each row of y is a single draw among the 4 brands
        Like = pm.Multinomial('Like', n=1, p=p, observed=y)
        step = pm.Metropolis()
        trace = pm.sample(5000, tune=1000, step=step)
    print(trace['alpha'][2000:5000].mean(axis=0))
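In case it helps with diagnosing this, here is the kind of inspection I can run on the trace (a minimal sketch for reference; the matplotlib import is extra, not part of the model code above):

import matplotlib.pyplot as plt
# visual check of chain mixing for mu and alpha
pm.traceplot(trace)
plt.show()
# tabular summary of the posterior (means, intervals, diagnostics)
print(pm.summary(trace))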
The alpha means I got for the 5-household case are:
[[ 1.79400609e-08 -4.60850162e-08 6.50863713e-09]
[ 1.77725171e-08 -1.05121019e-08 2.82981168e-09]
[ 1.81173138e-08 1.78398032e-08 2.53601684e-08]
[ 1.76738471e-08 1.70500670e-08 2.45552148e-08]
[ 1.78436486e-08 -3.60380436e-08 1.99448907e-08]]
Modifying a few dimensions in the data and rerunning the same code with 4, 3, and 2 households, I got gradually larger alpha means.
For 4 households,
[[-0.01718291 -0.00599668 0.03564388]
[-0.01047885 -0.00710468 0.03176734]
[ 0.00345223 -0.00845897 0.02691133]
[-0.0044705 -0.00827159 -0.0268496 ]]
For 3 households,
[[0.90416004 3.72164376 2.18054261]
[2.44950274 0.98309626 4.14634506]
[2.55693314 2.47482759 0.67979624]]
For 2 households,
[[ 2.69389092 6.53143594 5.29746544]
[ 4.18144832 -3.51906643 5.71260768]]
The entire yogurt data set has 100 households. If I run on the entire data set, the alpha means come out as zeros.
Thank you very much for shedding light on what’s going on in my example! I truly appreciate your help!