# Samples from Conditional Posterior Distribution

Let us consider the following Hierarchical Bayesian model:

• w ~ Beta(20, 20)
• K = 6
• a = w * (K - 2) + 1
• b = (1 - w) * (K - 2) + 1
• theta ~ Beta(a, b)
• y ~ Bern(theta)

The above is the example from Figure 9.3 in the book Doing Bayesian Data Analysis by John Kruschke.
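For context, the a/b transformation above is Kruschke's mode/concentration parameterization: w plays the role of the mode (omega) of the Beta prior on theta, and K the concentration (kappa). A quick sketch, using only the definitions above, confirms that a + b = K and that the mode (a - 1)/(a + b - 2) of Beta(a, b) recovers w:

```python
import numpy as np

K = 6  # concentration (kappa in Kruschke's notation)

# For any mode w in (0, 1), check that the transformed prior Beta(a, b)
# has concentration a + b = K and mode exactly w (valid here since a, b > 1).
for w in np.linspace(0.1, 0.9, 9):
    a = w * (K - 2) + 1
    b = (1 - w) * (K - 2) + 1
    mode = (a - 1) / (a + b - 2)
    assert np.isclose(a + b, K)
    assert np.isclose(mode, w)

print("a + b = K and mode(Beta(a, b)) = w for all tested w")
```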

I use PyMC3 to construct the model as follows:

import numpy as np
import pymc3 as pm
import arviz as az

K = 6
D = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0])

with pm.Model() as mdl:
    w = pm.Beta('w', alpha=20., beta=20.)
    a = w * (K - 2) + 1
    b = (1 - w) * (K - 2) + 1
    theta = pm.Beta('theta', alpha=a, beta=b)
    y = pm.Bernoulli('y', p=theta, observed=D)
    trace = pm.sample(10000, tune=3000)


My question is the following:

• How can I compute the conditional posterior probability p(theta | w=0.25, D)? In other words, how can I obtain samples from the conditional posterior distribution of theta | w=0.25, D?

Can you just set w=0.25?

You can indeed just set w=0.25 and re-sample the model. In this case the model is simple and sampling is cheap, so that’s probably the way to go. I just wanted to add that if the model were expensive to sample, you could also get conditional probabilities from the joint posterior using boolean masks. The only wrinkle is that, since w is a continuous variable, you’re not going to have any samples at exactly w=0.25 from which to construct the conditional distribution. To get around this, I binned the values of w into chunks of width 0.1, so what I actually built is p(theta | w in [0.2, 0.3), D):

bins = np.linspace(0.2, 0.8, 7)  # bin edges of width 0.1
axes = az.plot_pair(trace, var_names=['w', 'theta'], marginals=True)
main_plot = axes[1][0]
ymin, ymax = main_plot.get_ylim()
main_plot.vlines(bins, ymin=ymin, ymax=ymax, ls='--', color='k', zorder=3)


The conditional probability you are interested in is the cluster of points in the first bin, between 0.2 and 0.3. It’s pretty far out on the tail, so there isn’t much to go on (hence the wider bin size).

You can grab values of theta in that bin, and then sample from these points as your conditional posterior:

post = trace  # the MultiTrace returned by pm.sample
w_bins = np.digitize(post['w'], bins=bins)
theta_given_w25 = post['theta'][w_bins == 1]


For comparison, this “tabular” method gives the conditional posterior mean of theta as 0.6141472 and std of 0.10960547, while resampling the model with w=0.25 gives the mean as 0.611 and std of 0.113, so you can see they are roughly equivalent.
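Both numbers can also be checked analytically. This isn’t in the thread, but conditional on w the model is Beta–Bernoulli conjugate, so p(theta | w=0.25, D) has a closed form:

```python
import numpy as np

K = 6
D = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0])

w = 0.25
a = w * (K - 2) + 1        # prior is Beta(2, 4) given w = 0.25
b = (1 - w) * (K - 2) + 1

# Conjugate update: Beta prior + Bernoulli data -> Beta posterior
a_post = a + D.sum()              # add successes
b_post = b + len(D) - D.sum()     # add failures
mean = a_post / (a_post + b_post)
var = a_post * b_post / ((a_post + b_post) ** 2 * (a_post + b_post + 1))

print(round(mean, 4), round(np.sqrt(var), 4))  # 0.6111 0.1118
```

The exact conditional posterior is Beta(11, 7), whose mean 0.6111 and std 0.1118 sit right between the tabular and resampling estimates.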


Thank you very much for the great answer.

I fully understand the second way using boolean masks; that is actually what I thought of at first. The problem is: what if I wanted the conditional probability at w = 0.10, where I don’t have any samples?

Also, by re-sampling the model, do you mean re-running the full code with w = 0.25 set?

The conditional probability at w = 0.1 might just be effectively zero; the posterior puts essentially no mass on w in that region, so no samples land there. Them’s the breaks.

Exactly, replace w = pm.Beta('w', alpha=20., beta=20.) with w = 0.25. The sampler started complaining about convergence when I did it, which is normal because we know we’re in a very flat region of the likelihood space (i.e. there’s not much information out there for the sampler to go on). I imagine if you put in w = 0.1 it will crash and burn. But it’s worth a try!

Thank you very much.

When I ran the sampler with w = 0.10 and computed the average of the samples, it actually returned 0.5779, which is very close to the value computed from grid approximation (0.5778).
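That agreement is no accident: conditional on w, the posterior for theta is conjugate, and the exact conditional mean at w = 0.10 comes out to the same number. A quick check:

```python
import numpy as np

K = 6
D = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0])

w = 0.10
a = w * (K - 2) + 1                    # 1.4
b = (1 - w) * (K - 2) + 1              # 4.6
# conjugate update: posterior is Beta(a + successes, b + failures)
mean = (a + D.sum()) / (a + b + len(D))
print(round(mean, 4))  # 0.5778
```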
