Prior, prior predictive and sample_from_observed

I was skimming through the code in pymc4.sample_prior_predictive, and I wanted to clarify some concepts.

I realized that it always returns all the results in the prior_predictive group of the inference data object, which I think will be confusing for users (especially as the importance of both prior and prior predictive checks increases) and can also make it harder to use all of ArviZ's features.

Would it be possible to divide the variables into prior and prior_predictive groups in the same way as variables are divided between posterior and posterior_predictive?

I have found that conceptually distinguishing between prior and prior_predictive is generally harder than distinguishing between posterior and posterior_predictive, and keeping them combined in PyMC4 will probably keep the confusion alive. Below I list the two main arguments for keeping both quantities combined that came to mind, because I am not sure I completely grasp the whole situation.

I know that both quantities can be sampled at the same time and therefore doing something like prior = pm.sample_prior(model); prior_pred = pm.sample_prior_predictive(prior, model) is not efficient at all. However, both quantities can be sampled at the same time and still be stored each in the corresponding group of the resulting inference data. When the return value is an inference data object, computational efficiency and storing them in different groups seem perfectly compatible.
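To illustrate what I mean, here is a minimal sketch in plain Python. It does not use the PyMC4 or ArviZ APIs: the toy model (theta ~ Normal(0, 1), y* | theta ~ Normal(theta, 1)), the function name sample_prior_and_predictive and the nested dict standing in for an inference data object are all assumptions made up for this example. The point is only that a single sampling pass can fill two separate groups.

```python
import random


def sample_prior_and_predictive(n_draws, seed=0):
    """Draw theta and y* together, but store them in separate groups.

    Hypothetical toy model: theta ~ Normal(0, 1), y* | theta ~ Normal(theta, 1).
    The returned nested dict is only a stand-in for an inference data object.
    """
    rng = random.Random(seed)
    groups = {"prior": {"theta": []}, "prior_predictive": {"y": []}}
    for _ in range(n_draws):
        theta = rng.gauss(0.0, 1.0)     # theta ~ p(theta)
        y_star = rng.gauss(theta, 1.0)  # y* ~ p(y* | theta), same pass
        groups["prior"]["theta"].append(theta)
        groups["prior_predictive"]["y"].append(y_star)
    return groups


idata_like = sample_prior_and_predictive(n_draws=500)
print(sorted(idata_like))  # names of the groups that were filled
```

Each draw of y* reuses the theta drawn in the same iteration, so nothing is sampled twice; the split between groups only happens at storage time.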

I have seen the argument sample_from_observed, which may make it harder to distinguish between the two quantities; however, I have not been able to understand what it does conceptually. To me, neither prior=p(\theta) nor prior_predictive=\int p(y^*|\theta) p(\theta) d\theta knows about the observed data y, so I can’t wrap my head around what is computed by pm.sample_prior_predictive with sample_from_observed=False. We only get samples from \theta (prior/posterior variables), and their distribution is somehow conditional on the observed data y, but it clearly isn’t the posterior. :thinking:
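To make the definitions I am working from concrete, here is a small Monte Carlo check in plain Python (again, not PyMC4 code). The toy model theta ~ Normal(0, 1), y* | theta ~ Normal(theta, 1) and the name prior_predictive_draws are assumptions for illustration. For this model the integral above works out to y* ~ Normal(0, 2) in mean/variance terms, and the sampler never receives any observed y.

```python
import random
import statistics


def prior_predictive_draws(n_draws, seed=1):
    """Ancestral sampling from the prior predictive of a hypothetical toy model.

    theta ~ Normal(0, 1), y* | theta ~ Normal(theta, 1).
    No observed data is passed in anywhere.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        theta = rng.gauss(0.0, 1.0)          # draw from p(theta)
        draws.append(rng.gauss(theta, 1.0))  # draw from p(y* | theta)
    return draws


y_star = prior_predictive_draws(20000)
pp_mean = statistics.fmean(y_star)
pp_var = statistics.pvariance(y_star)
# Analytically: E[y*] = 0 and Var[y*] = Var[theta] + 1 = 2,
# so both estimates should land close to those values.
print(pp_mean, pp_var)
```

This is what I would expect to find in a prior_predictive group; anything whose distribution changes with the observed y does not fit either definition as far as I can tell.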
