Beta-Binomial conjugate prior -- pm.Binomial buggy results...?

jbuddy_13 · November 3, 2021, 5:33pm

I’m running into what I think might be a bug with PyMC3’s Binomial likelihood. Here’s the example.

Simulate data

itrs = 100 
p = 0.4
W = []
L = []
N = []

for i in range(itrs):
  n = int(np.random.normal(loc=100, scale=20))
  wins = np.random.binomial(n=n,p=p)
  losses = n - wins
  N.append(n)
  W.append(wins)
  L.append(losses)

Define a model

with pm.Model() as m1:
  alpha = pm.HalfNormal('alpha',sigma=1)
  beta = pm.HalfNormal('beta',sigma=1)
  theta = pm.Beta('theta',alpha, beta)
  lik = pm.Binomial('likelihood',n=N,p=theta,observed=W)
  trace1 = pm.sample(chains=4)

Examine parameter estimates

pm.traceplot(trace1, compact=False)

Note: Look at the distribution of Theta.
4. Use mean inferred alpha and beta parameters to visualize P(\theta|X).

def sample_post_plot(samples=100, trace_obj=trace1):
  X = np.linspace(0,1,1000)  
  for i in range(samples):
    idx = random.randint(0,len(trace_obj.get_values('alpha')))
    a = trace_obj.get_values('alpha')[idx]
    b = trace_obj.get_values('beta')[idx]
    Y = stats.beta(a=a,b=b).pdf(X)
    plt.plot(X,Y)  

sample_post_plot(trace_obj=trace1)

Note: This is a completely different distribution than the theta posterior plot from step 3.
4. Reference expectation E[P(\theta|E[\alpha], E[\beta])]

np.average(np.random.beta(a=az.summary(trace1).to_dict()['mean']['alpha'],
               b=az.summary(trace1).to_dict()['mean']['beta'], size=1000))
>>>
0.4701658349492111

Note: So both the shape and expectation of P(\theta|\alpha, \beta) differ from PyMC3’s posterior plots. What’s perplexing to me is that E[P(\theta)]=0.403 is extremely close to the theta value we provided in step 1 (0.4). So posterior inference on theta is correct; it’s just that posterior inference on alpha and beta, is completely off.

What’s going on here? I’m hesitant to think this is truly a PyMC3 bug; probably more likely operator error.

Edit: I’ve discovered that this {alpha, beta, theta} disagreement, for lack of a better term, goes away when I use a Beta-Binomial likelihood instead. This post os already quite long, so for the sake of brevity, I’ve linked a Google Colab with all relevant code.

Nonetheless, I’m still really curious-- what is going on here?

ricardoV94 · November 3, 2021, 8:04pm

I might be glancing over the obvious issue, but I don’t think you have enough information to constrain your hyperparameters alpha and beta. You have one single theta with which they have to agree, so I would expect their posteriors to not be that much different from the priors.

Similarly your theta posterior would be much more uncertain if you had a single observation in the binomial (that is n=1).

Remember that the hyperpriors are conditionally independent from the likelihood, once we consider theta.

jbuddy_13 · November 3, 2021, 8:12pm

So, there are 100 [N,L,W] sets (N trials, L losses, W wins). So pm.Binomial is receiving the N and W arrays, each of length=100. When I swapped out the Binomial likelihood for Beta-Binomial likelihood, the issue went away. What I can’t figure out is… why?

For a more extensive conversation, see my Cross Validated Stack Exchange post.

ricardoV94 · November 3, 2021, 9:20pm

The beta binomial corresponds to a model where each of those binomial arrays is given it’s own theta prior. So your theta should have shape 100 for those to be equivalent models.

jbuddy_13 · November 3, 2021, 9:26pm

Just to be clear, you’re referring to pm.BetaBinomial(), right?

ricardoV94 · November 3, 2021, 9:26pm

That is:

with pm.Model() as m1:
  alpha = pm.HalfNormal('alpha',sigma=1)
  beta = pm.HalfNormal('beta',sigma=1)
  theta = pm.Beta('theta',alpha, beta, shape=len(W))
  lik = pm.Binomial('likelihood',n=N,p=theta,observed=W)
  trace1 = pm.sample(chains=4)

ricardoV94 · November 3, 2021, 9:27pm

Yes

jbuddy_13 · November 3, 2021, 9:39pm

So, your code used pm.Binomial() not pm.BetaBinomial(). I ran it as listed and it worked perfectly! Was this just a typo on this comment?

ricardoV94 · November 3, 2021, 9:54pm

I am not sure where did I make a typo. A beta-binomial distribution or model refers to this kind of model where each observation is modelled with a binomial likelihood whose p parameter is in turn given a unique beta prior that has shared alpha/beta hyper priors.

This:

with pm.Model() as m1:
  alpha = pm.HalfNormal('alpha',sigma=1)
  beta = pm.HalfNormal('beta',sigma=1)
  theta = pm.Beta('theta',alpha, beta, shape=len(W))
  lik = pm.Binomial('likelihood',n=N,p=theta,observed=W)

And this are therefore equivalent:

with pm.Model() as m1:
  alpha = pm.HalfNormal('alpha',sigma=1)
  beta = pm.HalfNormal('beta',sigma=1)
  lik = pm.BetaBinomial('likelihood',n=N,alpha=alpha, beta=beta,observed=W)

Except theta is marginalized out in the later

jbuddy_13 · November 3, 2021, 10:07pm

I get it now! Thanks!

Topic		Replies	Views
Sampler issues on Beta prior Binomial likelihood v5 bug	5	394	December 8, 2022
Pymc3 gives shifted result compared to analytical bayes solution v3	4	386	June 19, 2023
Beta-Binomial vs Binomial likelihood with Beta prior? Questions	1	690	January 29, 2021
Posterior Predictive distributions: beta-binomial models Questions	1	655	March 10, 2021
How to implement Beta Function effectively in Pymc3 Questions	8	1933	November 3, 2021

Beta-Binomial conjugate prior -- pm.Binomial buggy results...?

Related topics