I was skimming through the code in `pymc4.sample_prior_predictive`, and I wanted to clarify some concepts.
I realized that it always returns all the results in the `prior_predictive` group of the inference data object, which I think will be confusing for users (especially as the importance of both prior and prior predictive checks increases) and can also make it harder to use all of ArviZ's features.
Would it be possible to divide the variables into `prior` and `prior_predictive` groups, in the same way that variables are divided between `posterior` and `posterior_predictive`?
I have found that conceptually distinguishing between `prior` and `prior_predictive` is generally harder than distinguishing between `posterior` and `posterior_predictive`, and keeping them combined in PyMC4 will probably keep the confusion alive. Below I list the two main arguments for keeping both quantities combined that came to mind, because I am not sure I completely grasp the whole situation.
I know that both quantities can be sampled at the same time, and therefore doing something like `prior = pm.sample_prior(model); prior_pred = pm.sample_prior_predictive(prior, model)` is not efficient at all. However, both quantities can still be sampled in a single pass and each stored in the corresponding group of the resulting inference data. When the return value is an inference data object, computational efficiency and storing them in different groups seem perfectly compatible.
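To illustrate what I mean by "sampled together but stored separately", here is a minimal sketch using `arviz.from_dict`. The model, variable names (`theta`, `y_star`), and draws are all hypothetical; the point is only that a single set of forward draws can be split across the `prior` and `prior_predictive` groups of one `InferenceData` object:

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)

# Hypothetical draws from a single forward-sampling pass:
# theta is a latent variable -> belongs in the "prior" group,
# y_star are simulated observations -> belong in "prior_predictive".
# Shapes follow ArviZ's (chain, draw, *shape) convention.
theta = rng.normal(size=(1, 500))
y_star = rng.normal(loc=theta[..., None], size=(1, 500, 10))

# One InferenceData object, two separate groups.
idata = az.from_dict(
    prior={"theta": theta},
    prior_predictive={"y_star": y_star},
)

print(idata.groups())
```

Both groups live in the same object, so nothing is sampled twice; the split is purely about where each variable is stored.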
I have seen the `sample_from_observed` argument, which may make distinguishing between the two quantities difficult; however, I have not been able to understand what it does conceptually. To me, neither the prior, p(\theta), nor the prior predictive, \int p(y^*|\theta) p(\theta) d\theta, knows about the observed data y, so I can't wrap my head around what is computed by `pm.sample_prior_predictive` with `sample_from_observed=False`. We would only get samples of \theta (prior/posterior variables) whose distribution is somehow conditional on the observed data y, yet it clearly isn't the posterior.
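For reference, here is how I understand the two quantities, written out in plain NumPy (not the PyMC4 API) for a toy model where \theta ~ Normal(0, 1) and y|\theta ~ Normal(\theta, 1). The model and names are illustrative assumptions; note that the observed data y never enters either computation:

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws = 1000

# Prior: draws from p(theta), here theta ~ Normal(0, 1).
theta = rng.normal(0.0, 1.0, size=n_draws)

# Prior predictive: y* ~ p(y*|theta) with theta drawn from the prior,
# i.e. a Monte Carlo approximation of \int p(y*|theta) p(theta) dtheta.
# Here y|theta ~ Normal(theta, 1), so marginally y* ~ Normal(0, 2).
y_star = rng.normal(theta, 1.0)

# The observed data y plays no role above; only the model structure does,
# which is why I don't see what conditioning on y could mean here.
print(theta.std(), y_star.std())  # roughly 1 and sqrt(2)
```

Under this reading, both quantities are pure forward draws from the generative model, which is why `sample_from_observed` confuses me.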