Posterior Predictive Sampling -- Works for Hierarchical Model?

rpgoldman · June 12, 2019, 5:55pm

In the code for sample_posterior_predictive(), if I understand it correctly, PyMC3 picks points from the trace, and then collects samples from the output variables (or, more generally, the variables the caller specifies), given the values of the variables in the point.
There’s a notebook that shows how to use this with a regression model, which is able to generalize (predict values for unseen combinations of predictors).
Again, IIUC, this works because of the following dependency structure:

The key feature here is that the regression weights are independent of the predictors. So we can sample from the posterior predictive by pulling a point from the trace (the variables in the upper right of the above figure) and then generating random observations from that point, the independent predictors, and the deterministic formulas.
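To spell out what I mean, here is a minimal sketch of that procedure in plain Python rather than PyMC3. The "trace" is a list of synthetic draws that I made up, not the output of a real sampler, and `predict_from_points` and the variable names are hypothetical.

```python
import random


def predict_from_points(points, x_new, rng):
    """Draw one observation per posterior point for a new predictor value.

    The parameters in each point do not depend on x, so any point from the
    trace can be paired with any new value x_new.
    """
    draws = []
    for pt in points:
        mean = pt["intercept"] + pt["slope"] * x_new  # deterministic formula
        draws.append(rng.gauss(mean, pt["sigma"]))    # random observation
    return draws


rng = random.Random(0)
# Stand-in for a trace: synthetic parameter draws, for illustration only.
points = [
    {"intercept": rng.gauss(1.0, 0.1), "slope": rng.gauss(2.0, 0.1), "sigma": 0.5}
    for _ in range(2000)
]
y_new = predict_from_points(points, x_new=3.0, rng=rng)
```

Nothing in a point has to be revised when `x_new` changes, which is why the trace can be reused as-is for unseen predictor values.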
But what happens if we have a hierarchical model that has a structure with the following shape:

For example, maybe there’s a categorical variable, representing cluster membership, that is sampled with a conditional probability conditioned on the predictor variables. In this case, the latent value in the center is dependent upon the predictor values, and should be chosen from the posterior based on the predictor values, rather than plucked from the trace independent of the predictors.
In this case, won’t sample_posterior_predictive() generate incorrect samples? Or have I misread something?
If I am right, then PyMC3 cannot generate posterior predictions of unseen predictor combinations in cases like the hierarchical model. PyMC3 can only generate posterior predictions correctly if the parameters are all independent of the predictors.
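To make the worry concrete, here is a toy sketch in plain Python (not PyMC3). Everything in it is an assumption for illustration: the logistic membership probability, the parameter values, and the name `p_cluster1`. The "trace" of memberships is simply drawn conditional on the training predictor rather than produced by a sampler.

```python
import math
import random


def p_cluster1(x, a):
    """Probability of membership in cluster 1 given predictor x (logistic)."""
    return 1.0 / (1.0 + math.exp(-a * x))


rng = random.Random(1)
a, mu, sigma = 2.0, (0.0, 10.0), 1.0  # treated as known, for illustration only
x_train, x_new = -3.0, 3.0

# Latent memberships as they would sit in a trace: drawn under x_train.
z_trace = [int(rng.random() < p_cluster1(x_train, a)) for _ in range(2000)]

# (a) Reuse the stored memberships for the new predictor (the suspect path).
y_reuse = [rng.gauss(mu[z], sigma) for z in z_trace]

# (b) Redraw the membership conditional on the new predictor.
z_new = [int(rng.random() < p_cluster1(x_new, a)) for _ in range(2000)]
y_redraw = [rng.gauss(mu[z], sigma) for z in z_new]

mean_reuse = sum(y_reuse) / len(y_reuse)
mean_redraw = sum(y_redraw) / len(y_redraw)
```

With these numbers, reusing the stored memberships keeps the predictions near the cluster-0 mean, while redrawing them for the new predictor moves the predictions near the cluster-1 mean.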

lucianopaz · June 12, 2019, 7:25pm

Yes, you’re right. In this case the samples will not match what should be expected given the conditional dependency between the predictors and latent variables.

This doesn’t mean the hierarchical regression models usually used in PyMC3 are wrong. The model structure is something like

\mu_{0} \sim Normal(...) \\ \beta_{i} \sim Normal(\mu_{0}, ...) \\ y \sim Normal(X \cdot \vec{\beta}, ...)

So the parameters are all independent of the predictors.

Then in Gaussian mixture models, the latent variables that represent class membership are marginalized out, so again there are no problems with posterior predictive sampling.

In the cases where you do collect samples of the latent variable, the way to get correct out-of-sample predictions is to not supply the entire trace to sample_posterior_predictive. You can pass a list of point dictionaries instead of a trace, after dropping some keys from each dictionary (namely the latent variables and the nodes that conditionally depend on them). This is easily done by converting the trace to a pandas dataframe:

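    import pymc3 as pm

    # Variables to keep: everything except the latent variables and the nodes
    # that depend on them. These names are placeholders for your own model.
    keep = ["mu_0", "sigma"]

    with model:  # `model` is your pm.Model, defined elsewhere
        trace = pm.sample()
        df = pm.trace_to_dataframe(trace,
                                   varnames=keep,
                                   include_transformed=True)
        # We have to supply the samples kwarg because it cannot be inferred
        # if the input trace is not a MultiTrace instance
        ppc = pm.sample_posterior_predictive(trace=df.to_dict('records'),
                                             samples=len(df))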
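
The key-dropping step on its own can be sketched in plain Python, without PyMC3 or pandas. The helper `drop_latents` and the variable names are hypothetical, and the list below is a hand-written stand-in for the points of a trace. If I remember the API correctly, a MultiTrace can yield such dictionaries through `trace.points()`, and trace_to_dataframe flattens vector-valued variables into columns such as `beta__0`, so filtering the dictionaries directly may be the safer route for non-scalar variables.

```python
def drop_latents(points, drop):
    """Return copies of the point dictionaries without the keys in `drop`."""
    drop = set(drop)
    return [{k: v for k, v in pt.items() if k not in drop} for pt in points]


# Stand-in for the points of a trace; the variable names are hypothetical.
points = [
    {"beta": 0.9, "sigma": 1.1, "cluster": 0},
    {"beta": 1.2, "sigma": 0.8, "cluster": 1},
]
reduced = drop_latents(points, drop=["cluster"])
```

The reduced list is what would be passed as the trace argument, together with `samples=len(reduced)`.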

rpgoldman · June 12, 2019, 7:46pm

Thanks. That’s very helpful. It does mean, though, that the notebook about posterior predictive sampling avoids the space where things are tricky, so users might try to do things with sample_posterior_predictive() that will quietly yield bad results.
I wish I knew how to formulate an explanation of this process to put in the documentation.

lucianopaz · June 12, 2019, 7:57pm

I’m really short on time. I still need to write down a notebook on shape handling and shape_utils, but I’ll try to push some things onto the posterior predictive sampling notebook when I get the chance.

There are some threads here that talk about posterior predictive sampling that can serve as material for the notebooks

Topic	Category	Replies	Views	Activity
Predict on unseen group in hierarchical model	Questions	6	1556	November 9, 2020
Forecasting hierarchical models with sample_posterior_predictive	Questions	1	1247	August 14, 2019
How do we predict on new unseen groups in a hierarchical model in PyMC3?	Questions	10	5384	September 12, 2022
Prediction using sample_ppc in Hierarchical model	Questions, from_github	6	4973	December 14, 2017
Posterior Predictive Sampling in PyMC3 by Luciano Paz	PyMCon2020	2	1619	June 6, 2024
