Multinomial distribution doesn't behave as expected

I don’t think that the multinomial distribution is behaving as expected. I was trying to model the birthday paradox as a multinomial with p=[1/365]*365 and n=23, which works just fine in numpy; yet I have not been able to repro this in pymc3 - I get all arrays of size 365 with [23 … 0] (which is highly improbable).

import pymc3 as pm
import numpy as np
with pm.Model() as model:
    pm.Multinomial('k', n=23, p=np.array([1/365]*365), shape=(365,))
    trace = pm.sample()
    
print(
    np.random.multinomial(n=23, pvals=[1/365]*365),
    trace['k'][0]
)

Most likely the metropolis sampler is stuck as you have high dimensional variable.
If you want to sample from prior you should use pm.sample_prior_predictive

Amazing, I was looking exactly for sample_prior_predictive. Leaving my birthday paradox code here in exchange for this amazing piece of information and for reference :slight_smile:

import numpy as np
import pandas as pd

with pm.Model() as model:
    n = pm.DiscreteUniform('n', 1, 150)
    birthday_frequencies = pm.Multinomial(
        'k',
        n=n,
        p=[1/365]*365, 
        shape=(365)
    )
    colliding_birthdays = pm.Deterministic(
        'colliding_birthdays', 
        birthday_frequencies.max() > 1
    )
    trace = pm.sample_prior_predictive(samples=50000)

df = pd.DataFrame([trace['n'], trace['colliding_birthdays'].astype(int)]).T
df.columns = ['Number of people', 'P(colliding birthday)']
df.groupby('Number of people').mean().plot()
plt.grid()
1 Like