Interpretation of posterior predictive distribution

Hi PyMC community,

I have a perhaps naive question I’m looking for clarification on. I’ve been reading about posterior predictive simulation and all the great applications it’s capable of, across the PyMC Labs blogs and some of the thoughtful responses here on the Discourse, and I want to make sure I have the concepts clear.

Say I have two variables that represent an objective and a subjective measure of performance on a task. They’re imperfectly correlated at around .5, but the correlation reaches the significance threshold (so the field takes the relationship as a “given”!), and so there’s a broad perception that if individuals score high on one measure, they score high on the other. The objective task is slow and costly to administer, and your result on it puts you into groups based on some thresholds. To get around this, it’s been proposed that people have insight into their own ability on the task, so we can just ask them what they think they’d get: cheaper, faster, and “good enough”.

I’m thinking of leaning on posterior predictive simulation to draw attention to the implications of such a correlation at the level of the individual, and to why we should be more careful in our conclusions, but I want to make sure I’ve understood it correctly. Let’s say a subjective score of -2 corresponds to a cutoff of interest on the objective task. Can I use the posterior predictive distribution to make a probabilistic claim about how likely that score is on the objective task, given a model and some data? Some code that matches my use case is below, which hopefully makes my rambling clear:

import arviz as az
import numpy as np
import pymc as pm

rng = np.random.default_rng(35)

# Draw data
n = 200
subjective = pm.draw(pm.Normal.dist(), n, random_seed=rng)
objective = pm.draw(pm.Normal.dist(mu=subjective*0.52, sigma=0.83), random_seed=rng)

# Model
with pm.Model() as test:
    # Data
    X = pm.MutableData('X', subjective)
    Y = pm.MutableData('Y', objective)
    # Parameters
    β0 = pm.Normal('β0', mu=0, sigma=1)
    β1 = pm.Normal('β1', mu=0, sigma=1)
    σ = pm.HalfNormal('σ', sigma=1)
    # Likelihood
    y = pm.Normal('y',
                  mu=β0 + β1*X,
                  sigma=σ,
                  observed=Y,
                  shape=X.shape)  # shape follows X so predictions resize with new data
    idata = pm.sample(draws=500, tune=1000)

# Set new data, draw posterior predictive
with test:
    pm.set_data({"X": [-2], "Y": [0]})  # Y here is a dummy; only X matters for prediction
    pm.sample_posterior_predictive(idata, random_seed=rng, predictions=True, extend_inferencedata=True)

# Plot the prediction, highlight -2 probability
az.plot_posterior(idata, group='predictions', var_names='y', ref_val=-2)


I guess my question is really: what is the correct interpretation of this prediction? Is it that, given the model and the posterior, a likely future, unobserved objective score for a person scoring -2 on the subjective measure is on average -0.93, ranging from about -2.5 to 0.64, with the probability of it being -2 or less around 10%?
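To put a number on that last claim, I’m computing the tail probability directly from the predictive draws. This is just a numpy sketch with stand-in draws (in the real case they’d come from `idata.predictions['y']`), so the -0.93 location and 0.83 spread below are made-up summaries, not the actual posterior:

```python
import numpy as np

rng = np.random.default_rng(35)

# Stand-in for the flattened predictive draws; in practice something like:
# draws = idata.predictions['y'].values.ravel()
draws = rng.normal(loc=-0.93, scale=0.83, size=4000)

# Probability that the objective score is -2 or less
p_below_cutoff = (draws <= -2).mean()
print(f"P(objective <= -2) ~ {p_below_cutoff:.2f}")
```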

I’ve seen some examples where whole datasets of varying sizes are generated from the posterior predictive, but that doesn’t seem applicable here; this is only about a single new datapoint and its possible future values. Am I correct in this, or way off?

Tagging @tcapretto, @ricardoV94, @AlexAndorra, and @jessegrabowski who have written some great stuff on posterior predictive sampling in the past.

Thanks so much for reading!


That’s the idea. The conventional practice of producing a posterior predictive distribution for the observed data (the data originally used for inference) is to evaluate whether your model + your posterior distribution over the model parameters can (approximately) reproduce the data set. If not, then you should question both your model and the posterior. But this is all a method for doing diagnostics on your model. You can also use your model + posterior to generate predictions about new scenarios, often in the form of new predictor values (e.g., see this notebook on out-of-sample predictions). If you want to generate a prediction for a scenario in which there is a single, hypothetical person, that’s fine. Or, if you wanted, you could generate predictions about a whole range of hypothetical people in a single go. The possibilities are endless!
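To make the “whole range of hypothetical people” part concrete, here’s a minimal plain-numpy sketch of the same idea, with invented posterior summaries standing in for real draws from `idata.posterior` (in PyMC itself you’d just pass the whole grid to `pm.set_data` before calling `pm.sample_posterior_predictive`):

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented posterior draws for the regression parameters; in practice
# these come from idata.posterior['β0'], ['β1'], and ['σ']
n_draws = 2000
b0 = rng.normal(0.0, 0.06, size=n_draws)
b1 = rng.normal(0.5, 0.06, size=n_draws)
sigma = np.abs(rng.normal(0.83, 0.04, size=n_draws))

# A grid of hypothetical subjective scores
x_new = np.linspace(-3, 3, 7)

# One simulated objective score per posterior draw per hypothetical person
preds = rng.normal(loc=b0[:, None] + b1[:, None] * x_new[None, :],
                   scale=sigma[:, None])
print(preds.shape)  # one column of predictive draws per hypothetical person
```

Each column of `preds` is then a full posterior predictive distribution for one hypothetical person, so you can read off means, intervals, or tail probabilities for the whole grid at once.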


Thanks @cluhmann, that’s super helpful for getting things straight in my head and making the point I want to make! It’s such a powerful idea, and so useful.

Honestly, I find the term “posterior predictive checking” and “posterior predictive” to be a bit confusing. When trying to evaluate your model/posterior in the diagnostic mode, the term “posterior predictive checking” makes sense (it’s a form of model checking). But there’s a huge number of things you can use your posterior for and I think of these sorts of exercises as “simulation”. And when you talk about wanting to use your posterior to “simulate” new scenarios, it’s a bit more intuitive that there aren’t really any rules about what you should be simulating.

That’s a useful distinction. I was tied in knots trying to separate the idea of model checking via posterior predictive simulation from general prediction. I’ve seen some refer to the PPC as a “retrodictive” check, and that makes more theoretical sense to me. But it’s now much clearer that pm.sample_posterior_predictive has many uses! Thanks again!
