Hi,
I have a question to clarify my understanding.
When we do inference on a model (`pm.sample_posterior_predictive(trace)`), we get a set of 1000 arrays (the default number). Each array is built from a number of points.
My question: why don't the points building each array follow a smooth trajectory?
I thought that each array is obtained in two steps:
- draw a random sample from the posterior distributions, to get parameter values;
- compute the points from the formula given in the model, using the extracted parameters.
Or is each point instead obtained by its own random sample, and arranged in the matrix only for organizational purposes?
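To make sure I'm being clear, here is a NumPy-only sketch of the two-step picture I have in mind (the linear formula and all distribution values are made up for illustration, not from a real model):

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0, 1, 10)

# Step 1: one random draw of the parameters from (pretend) posteriors.
a = rng.normal(1.0, 0.1)   # intercept draw
b = rng.normal(2.0, 0.1)   # slope draw

# Step 2: evaluate the model formula with those parameters.
one_array = a + b * x      # 10 points -- a perfectly smooth line
```

If this were what happens, each array would trace a smooth curve in `x`, which is not what I observe.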
Thank you
Hi,
Hard to tell without an example, but I think I see what you're asking: posterior predictive samples don't only sample the posterior parameters; they also sample from the likelihood. This adds a layer of uncertainty, and explains why the samples don't follow a "smooth trajectory", as you say.
For a linear regression, for instance, `pm.sample_posterior_predictive` doesn't only sample from the mean, which is a combination of slope and intercept (`mu = a + b*x`); it also samples from the likelihood of the data: `pm.Normal("obs", mu=a + b*x, sigma=sigma, observed=obs_data)`. You get "new data", or retrodictions, as a result.
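Here is a minimal NumPy sketch of the difference (the posterior draws and `sigma` are made up here; in practice they would come from your trace, not be generated like this):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)

# Pretend posterior draws of intercept and slope (made up for
# illustration; in a real model they come from the trace).
a = rng.normal(1.0, 0.1, size=500)
b = rng.normal(2.0, 0.1, size=500)
sigma = 0.5

# Mean only: each row is a smooth line in x.
mu = a[:, None] + b[:, None] * x          # shape (500, 10)

# Posterior predictive: likelihood noise on top of mu -- not smooth.
ppc = rng.normal(loc=mu, scale=sigma)     # shape (500, 10)
```

Each row of `mu` is a straight line, but the corresponding row of `ppc` scatters around it because of the extra draw from the likelihood.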
Is that clearer?
So the sampling is on the likelihood, for every `x`. Got it, thank you!
Yeah, `pm.sample_posterior_predictive` basically samples the posterior parameters 500 times by default, drawing from the likelihood each time. So, if you had 10 data points, your posterior predictive samples should have shape `(500, 10)`.