Predict on unseen group in hierarchical model

Thanks to @lucianopaz’s talk about posterior predictions (Posterior Predictive Sampling in PyMC3 by Luciano Paz), I made some progress. Essentially, I changed the way to extract the posterior samples on the cluster level from the trace. Instead of extracting the values I need, I remove the traces I don’t need:

trace.remove_values('mu_t')
trace.remove_values('mu')
with model_factory(0, [1], [0], 3, 1):
    post_pred = pm.sample_posterior_predictive(trace)

This method was explained in the talk.

With this method, the model samples the mu for my unseen instance from the parameter distribution of the cluster it belongs to (cluster 1 in my sample code above).

However, when I predict for multiple unseen instances from the same cluster and observe the spread of mu values, it does not follow the distribution it is supposed to according to the model definition.
The model specifies that mu ~ N(a, sigma_a), so when I predict for multiple unseen instances, I would expect the spread of mu values across those instances to equal (approx.) sigma_a. However, the spread is much smaller.
Any idea what the cause could be?

1 Like