Question about Discrete Mixture

ricardoV94 · December 23, 2019, 11:54am

I am trying to emulate the dice problem from Allen Downer’s Think Bayes book in PyMC. The idea is that you have dies of different size (4,6,8,12,20) and you pick one at random and roll it. The question is what can you infer about the picked die given an outcome (e.g., observing a 6).

I tried to model this using a Mixture of Discrete Uniform variables, having a Dirichlet prior for the weights. I would expect that the posterior weight for the first die would be zero, but this is not the case. This model is probably not appropriate for the problem, but I am not sure why. Any ideas?

dice_sizes = np.array((4, 6, 8, 12, 20))

with pm.Model() as m:
    dice = pm.DiscreteUniform.dist(lower=[1], upper=dice_sizes)
   
    weights = pm.Dirichlet('weights', a=np.ones(len(dice_sizes)))
    roll = pm.Mixture('roll', comp_dists=dice, w=weights, observed=6, dtype='int64')

    trace_m = pm.sample(1000)

.

trace_m['weights'].mean(0)
-> array([0.16973499, 0.23528162, 0.21347763, 0.18884671, 0.19265905])

Thank you in advance!

ricardoV94 · January 30, 2020, 11:41am

Bump.

junpenglao · February 2, 2020, 1:57pm

Interesting, the problem here is that in Mixture class, the comp_dist is outputting a log_prob of -inf for dice 1 correctly, but the mixture log_prob use a pm.math.logsumexp that making none zero weight still valid (i.e., does not get rejected).

One of the solution is to replicate the generation process and use only discrete variables:

import numpy as np
import pymc3 as pm
import theano

dice_sizes = np.array((4, 6, 8, 12, 20))
p_dice = np.ones(len(dice_sizes)) / len(dice_sizes)
p_roll = []
for i in dice_sizes:
    p_temp = np.ones(max(dice_sizes)) / i
    p_temp[i:] = 0.
    p_roll.append(p_temp)
p_roll = theano.shared(np.asarray(p_roll))

observed = 6
with pm.Model() as m:
    dice = pm.Categorical('dice', p=p_dice, testval=4)
    roll = pm.Categorical('roll', p=p_roll[dice], observed=observed-1)
    trace_m = pm.sample(10000)

np.histogram(trace_m['dice'], np.arange(len(dice_sizes)+1), density=True)
# ==> (array([0.     , 0.38515, 0.2955 , 0.2026 , 0.11675]),
#      array([0, 1, 2, 3, 4, 5]))

which seems to work well also for the follow up question in the book by replace observed with observed = np.array([6, 8, 7, 7, 5, 4]).

ricardoV94 · February 2, 2020, 6:21pm

Thanks for looking it up. I ended up doing something similar to your suggestion:

dice_sizes = np.array((4, 6, 8, 12, 20))

with pm.Model() as m_3_1:
        obs = pm.Data('data', [6])
        dice = pm.Categorical('dice', p=np.ones(len(dice_sizes)), testval=2)
        dice_size = theano.shared(dice_sizes)[dice]
        roll = pm.DiscreteUniform('roll', lower=1, upper=dice_size, observed=obs)

I just feel it would have been more clean to use the Mixture object. Is this a bug or should it be considered standard behavior?

junpenglao · February 2, 2020, 8:26pm

Well you can use a mixture with a Dirichlet prior that concentrated the weight to 0 or 1:

observed = 6
valid_dice = observed <= dice_sizes
weights_prior = np.ones(sum(valid_dice))/10.

with pm.Model() as m:
    dice = pm.DiscreteUniform.dist(lower=[1], upper=dice_sizes[valid_dice])
    weights = pm.Dirichlet('weights', a=weights_prior)
    roll = pm.Mixture('roll', w=weights, comp_dists=dice, observed=observed)

    trace_m = pm.sample(1000, tune=10000)

Topic		Replies	Views
Dungeons & Dragons Dice problem (Allen Downey) Questions	1	544	August 5, 2020
Using Multinomial with an underlying model for the probabilities Questions	5	669	October 23, 2019
Discrete DensityDist v5 modeling	5	503	January 7, 2023
Struggling with Beta mixture models Questions	3	1821	February 2, 2019
Mixture of Linear models Questions	3	1191	July 21, 2018

Question about Discrete Mixture

Related topics