Problem with pymc3.Multinomial() distribution?

Dear PyMC3 developers:
Not sure what’s going on with pm.Multinomial() distribution, or it is my coding mistake as I’m new to PyMC3. I was trying to fit a Bayesian Hierarchical Model with the Yogurt data - starting with 2 households to 5 households. As the number of households increase from 2 to 5, the draw values from my multivariate alpha random variable getting smaller and smaller. Here alpha dimension is (number of households x 3. I copied the data in the code below up to 5 households for testing. Each response y from a household contains 4 values (4 choices).There are multiple rows of y for each household. Alpha represents the choice preferences from different households, one column less than y’s column because that level serves as reference in coding. Here is the specific code for 5 households case.

import theano.tensor as tt
import pymc3 as pm
import numpy as np

num_house = 5
np.random.seed(123)
mu_mean = np.zeros(3)
mu_cov = np.array([[625, 200, 200],[200,625,200], [200, 200, 625]])

# choices per household index
choices_index = [0]*8 + [1]*20 + [2]*15 +[3]*14 + [4]*14 

# Get y data = (71 x 4) for 5 households
Yoplait = [0]*8 + [1]*4 + [0]*19 +[1]*8 + [0]*13 + [1]*3 + [0]*2 + [1]*3 + [0]*3 + [1]*2 + [0]*5 + [1] 
Yoplait = np.array(Yoplait).reshape(71,1)
Dannon = [0] + [1]*6 + [0]*21 + [1]*3 + [0]*9 + [1]*3 + [0]*2 + [1]*2 + [0]*3 + [1]*2 +[0]*4 + [1]+[0]*3+[1]*3+[0]*2+[1]*5+[0] 
Dannon = np.array(Dannon).reshape(71,1)
Weight = [1] + [0]*6 + [1] + [0]*4 + [1]*16 + [0]*11 + [1] + [0]*15 + [1] + [0]*15
Weight = np.array(Weight).reshape(71,1)
Hiland = [0]*43 + [1]*2 + [0]*2 + [1]*3 + [0]*21
Hiland = np.array(Hiland).reshape(71,1)
y = np.concatenate((Yoplait,Dannon,Weight,Hiland),axis = 1)

if __name__ == '__main__':
    Yogurt_Model = pm.Model()
    with Yogurt_Model:
        
        # Specify prior distribution for population mean of alpha - utility parameters for household
        mu = pm.MvNormal('mu', mu = mu_mean, cov = mu_cov, shape = len(mu_mean))
        
        # prior distribution for covariance of alpha 
        packed_L = pm.LKJCholeskyCov('packed_L', n = 3, eta = 2., sd_dist = pm.HalfCauchy.dist(2.5))
        L = pm.expand_packed_triangular(3,packed_L)
        
        # prior distribution for alpha
        alpha = pm.MvNormal('alpha', mu = mu, chol = L, shape = (num_house,len(mu_mean)))
        
        # Reshape alpha to have repeated alpha rows for each household
        alpha_reshaped = pm.Deterministic('alpha_reshaped', alpha[choices_index])
        
        # add the zero column for the reference level to alpha_reshaped
        zero_vec = np.zeros((len(choices_index),1))
        alpha_reshaped = tt.concatenate([alpha_reshaped,zero_vec], axis = 1)
                                                       
        # use softmax function to calculate p used in multinomial distribution 
        p = pm.Deterministic('p', tt.nnet.softmax(alpha_reshaped))
      
        #Likelihood
        Like = pm.Multinomial('Like', n= 1, p = p, observed = y)
        step = pm.Metropolis()   
        trace = pm.sample(5000, tune = 1000, step = step)  
        
        print(trace['alpha'][2000:5000].mean(axis=0))
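
For reference, the zero-column / reference-level construction I'm using can be sanity-checked in plain NumPy, independent of PyMC3 (a minimal sketch; the `softmax` helper here stands in for `tt.nnet.softmax`):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

alpha_row = np.array([1.0, 2.0, 0.5])        # 3 free utilities for one observation
full = np.concatenate([alpha_row, [0.0]])    # append zero for the reference level
p = softmax(full)

# probabilities sum to 1, and the reference level gets 1 / (1 + sum(exp(alpha)))
assert np.isclose(p.sum(), 1.0)
assert np.isclose(p[-1], 1.0 / (1.0 + np.exp(alpha_row).sum()))
```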

The alpha means I got for 5 households are

[[ 1.79400609e-08 -4.60850162e-08 6.50863713e-09]
[ 1.77725171e-08 -1.05121019e-08 2.82981168e-09]
[ 1.81173138e-08 1.78398032e-08 2.53601684e-08]
[ 1.76738471e-08 1.70500670e-08 2.45552148e-08]
[ 1.78436486e-08 -3.60380436e-08 1.99448907e-08]]

Modifying a few dimensions in the data and rerunning the same code with 4, 3, and 2 households, I got gradually increasing alpha means.

For 4 households,
[[-0.01718291 -0.00599668 0.03564388]
[-0.01047885 -0.00710468 0.03176734]
[ 0.00345223 -0.00845897 0.02691133]
[-0.0044705 -0.00827159 -0.0268496 ]]

For 3 households,
[[0.90416004 3.72164376 2.18054261]
[2.44950274 0.98309626 4.14634506]
[2.55693314 2.47482759 0.67979624]]

For 2 households,
[[ 2.69389092 6.53143594 5.29746544]
[ 4.18144832 -3.51906643 5.71260768]]

The entire yogurt data set has 100 households. If I run the entire data set, the alpha values are all zeros.
Thank you very much for shedding light on what's going on with my example! I truly appreciate your help!

Metropolis is likely being very inefficient at exploring the high-dimensional space. Also, since you have n=1, you can use Categorical instead of Multinomial. Here is a small rewrite of the model that seems to work:

with pm.Model() as Yogurt_Model:

    # Specify prior distribution for population mean of alpha - utility parameters for household
    mu = pm.MvNormal('mu', mu=mu_mean, cov=mu_cov, shape=len(mu_mean))

    # prior distribution for covariance of alpha
    packed_L = pm.LKJCholeskyCov(
        'packed_L', n=3, eta=2., sd_dist=pm.HalfNormal.dist(2.5))
    L = pm.expand_packed_triangular(3, packed_L)

    # prior distribution for alpha using a non-centered parameterization
    alpha_ = pm.Normal('alpha_', 0., 1., shape=(num_house, len(mu_mean)))
    alpha = alpha_@L + mu
    
    # Reshape alpha to have repeated alpha rows for each household
    alpha_reshaped = pm.Deterministic('alpha_reshaped', alpha[choices_index])

    # add the zero column for the reference level to alpha_reshaped
    zero_vec = np.zeros((len(choices_index), 1))
    alpha_reshaped = tt.concatenate([alpha_reshaped, zero_vec], axis=1)

    # use softmax function to calculate p used in multinomial distribution
    p = pm.Deterministic('p', tt.nnet.softmax(alpha_reshaped))

    # Likelihood
    Like = pm.Categorical('Like', p=p, observed=np.argmax(y, axis=1))
    trace = pm.sample(1000, tune=1000, return_inferencedata=True)

print(trace.posterior['alpha'].mean(axis=1))
<xarray.DataArray 'alpha' (chain: 2, alpha_dim_0: 5, alpha_dim_1: 3)>
array([[[ 0.66873137,  4.33626143,  2.8985855 ],
        [ 3.31531368,  0.12725865,  4.872073  ],
        [ 3.68108074,  3.36640514,  1.36393261],
        [-0.08666305,  0.45458892, -1.08253757],
        [ 4.12301146,  4.43430025,  0.21736631]],

       [[ 0.81179725,  4.16839674,  2.71980007],
        [ 3.73765871,  0.70367868,  5.21265732],
        [ 3.83967391,  3.54716772,  1.53611838],
        [-0.09483954,  0.45383569, -1.14647242],
        [ 4.03947381,  4.38736543,  0.02006402]]])
Coordinates:
  * chain        (chain) int64 0 1
  * alpha_dim_0  (alpha_dim_0) int64 0 1 2 3 4
  * alpha_dim_1  (alpha_dim_1) int64 0 1 2
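
To see why the non-centered parameterization above is valid, here is a minimal NumPy sketch: transforming standard normal draws by the Cholesky factor reproduces the target covariance (this uses the row-vector convention `z @ L.T`, where `Sigma = L @ L.T`; the values of `Sigma` and `mu` are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])
L = np.linalg.cholesky(Sigma)   # lower triangular, Sigma = L @ L.T
mu = np.array([1.0, -1.0, 0.5])

z = rng.standard_normal((200_000, 3))  # standard normal draws (like alpha_)
x = mu + z @ L.T                       # non-centered transform

# empirical covariance of the transformed draws matches Sigma
emp_cov = np.cov(x, rowvar=False)
assert np.max(np.abs(emp_cov - Sigma)) < 0.05
```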
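
The Multinomial-to-Categorical switch works because with n=1 the multinomial log-pmf of a one-hot observation reduces to the log-probability of the chosen category, which is exactly what `np.argmax` recovers. A quick check (example values made up):

```python
import numpy as np

p = np.array([0.2, 0.5, 0.1, 0.2])  # choice probabilities for one row
y = np.array([0, 1, 0, 0])          # one-hot observation (a single trial, n = 1)

# Multinomial log-pmf with n = 1: the combinatorial coefficient is 1,
# so it collapses to sum(y * log(p)) ...
multinomial_logp = np.sum(y * np.log(p))
# ... which equals the Categorical log-probability of the chosen index
categorical_logp = np.log(p[np.argmax(y)])
assert np.isclose(multinomial_logp, categorical_logp)
```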

@junpenglao Thank you very much for suggesting using Categorical! Will give it a try for the entire 100-household case. Thank you!

@junpenglao Thank you very much! Your code works for the entire data set, which contains 100 households: 100 3-component alpha distributions, each alpha representing an individual household. I also added 2 fixed coefficients (in classical statistics terminology), a coefficient distribution with 2 components, making this a mixed logit model. It works well. Thanks again!
