I don’t think that the multinomial distribution is behaving as expected. I was trying to model the birthday paradox as a multinomial with p=[1/365]*365 and n=23, which works just fine in numpy; yet I have not been able to repro this in pymc3 - I get all arrays of size 365 with [23 … 0] (which is highly improbable).
import pymc3 as pm
import numpy as np
with pm.Model() as model:
pm.Multinomial('k', n=23, p=np.array([1/365]*365), shape=(365,))
trace = pm.sample()
print(
np.random.multinomial(n=23, pvals=[1/365]*365),
trace['k'][0]
)
Most likely the metropolis sampler is stuck as you have high dimensional variable.
If you want to sample from prior you should use pm.sample_prior_predictive
Amazing, I was looking exactly for sample_prior_predictive
. Leaving my birthday paradox code here in exchange for this amazing piece of information and for reference
import numpy as np
import pandas as pd
with pm.Model() as model:
n = pm.DiscreteUniform('n', 1, 150)
birthday_frequencies = pm.Multinomial(
'k',
n=n,
p=[1/365]*365,
shape=(365)
)
colliding_birthdays = pm.Deterministic(
'colliding_birthdays',
birthday_frequencies.max() > 1
)
trace = pm.sample_prior_predictive(samples=50000)
df = pd.DataFrame([trace['n'], trace['colliding_birthdays'].astype(int)]).T
df.columns = ['Number of people', 'P(colliding birthday)']
df.groupby('Number of people').mean().plot()
plt.grid()
1 Like