I have created a model of bags of skittles largely based on the documentation for the Dirichlet Multinomial:
seeds = [1616, 3232, 6464, 1985]
colours = ["red", "orange", "yellow", "green", "purple"]
bags = ["skittles"]
coords = {"colour": colours, "bag": bags}
k = len(colours)
n = len(bags)
observed_by_size = {}
simulated_observed = [[int(3 + np.random.rand()*14) for i in range(5)] for j in range(1000)]
for sample in simulated_observed:
n = sum(sample)
if n in observed_by_size:
observed_by_size[n].append(sample)
else:
observed_by_size[n] = [sample]
skittles_model = pm.Model(coords=coords)
with skittles_model:
num_skittles_prior = pm.Gamma("num_skittles_prior", alpha=45, beta=1)
num_skittles = pm.Poisson("num_skittles", mu=num_skittles_prior)
frac = pm.Dirichlet("frac", a=np.ones(k), dims="colour")
conc = pm.Lognormal("conc", mu=1, sigma=1)
bag_of_skittles = pm.DirichletMultinomial(
"bag_of_skittles",
n=num_skittles,
a=frac * conc,
observed=observed_by_size.get(num_skittles, None),
dims=("bag", "colour")
)
pm.model_to_graphviz(skittles_model)
But when I sample from this model:
with skittles_model:
idata = pm.sample(draws=1000, cores=4, random_seed=seeds)
I do not get the expected result in two ways. First, the num_skittles variable never varies. It stays stuck on 44. Second, all four chains produce identical results despite passing in four different numbers for the random seed. When I sample from just the prior it behaves as expected.
Can anyone explain to me where I’ve gone wrong here?