pm.sample() - How can the posterior be defined when there is no observed data to calculate the likelihood?

My impression from Bayes' formula is that, to get a posterior probability, we need the prior probability and the likelihood of the observed values.

Having said that, if we define a model without passing observed data and then call pm.sample(), the documentation says that it samples from the posterior. So how is it that, without observed data, we get a posterior distribution?

In that case it samples from the prior. The documentation just describes the most common use, which is to call it on a model with observed nodes (or a Potential that corresponds to the likelihood).
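To see why this is true mechanically, here is a minimal sketch in plain Python (not PyMC internals; the model and function names are made up for illustration): an MCMC sampler's target density is the sum of the model's log-probability terms, so if the model contributes only prior terms, the chain simply explores the prior.

```python
import math
import random

def log_prior(x):
    # Standard normal log-density (up to a constant). With no observed
    # data there is no likelihood term, so this IS the full MCMC target.
    return -0.5 * x * x

def metropolis(n_draws, step=1.0, seed=42):
    """Random-walk Metropolis whose target is log_prior alone."""
    rng = random.Random(seed)
    x = 0.0
    draws = []
    for _ in range(n_draws):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, p(proposal) / p(x))
        if math.log(rng.random()) < log_prior(proposal) - log_prior(x):
            x = proposal
        draws.append(x)
    return draws

draws = metropolis(50_000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
# The chain's draws recover the N(0, 1) prior: mean near 0, variance near 1.
```

Adding observed data would just mean adding a log-likelihood term to the target; the sampler code itself would not change.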


I had the same question, @gms101, and thanks for the response, @ricardoV94.

One follow-up question:
When we call pm.sample without observed data, the returned idata instance contains "posterior" and "sample_stats". Given that we are sampling from the prior, would it make more sense to return an idata instance with the "prior" group populated, similar to what is obtained when calling pm.sample_prior_predictive?

In other words, should we expect the same behavior when pm.sample is called with no observed data as when calling pm.sample_prior_predictive?

I think this might just be about semantics, but when there is no observed data in the likelihood, pm.sample still returns the posterior distribution. It's just that because there was no observed data, the posterior equals the prior, because that's how Bayesian updating works. If you have a prior and then observe no data to update it with, the "new" posterior must be the same as the prior. Anything else wouldn't make sense!
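A quick conjugate sanity check of that point, in plain Python (nothing PyMC-specific; the helper name is made up): with a Beta(a, b) prior on a coin's bias and a Binomial likelihood, observing k heads in n flips gives a Beta(a + k, b + n - k) posterior, so with n = 0 the posterior parameters are the prior parameters unchanged.

```python
def beta_binomial_update(a, b, heads, flips):
    """Conjugate Bayesian update: Beta(a, b) prior + Binomial data
    -> Beta(a + heads, b + tails) posterior."""
    return a + heads, b + (flips - heads)

# With data, the prior gets updated.
print(beta_binomial_update(2, 2, heads=7, flips=10))  # (9, 5)

# With no data at all, the "posterior" is exactly the prior.
print(beta_binomial_update(2, 2, heads=0, flips=0))   # (2, 2)
```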

In other words, should we expect the same behavior when pm.sample is called with no observed data as when calling pm.sample_prior_predictive?

Yes, but it'll be labeled (I think correctly) posterior in the trace.


This question came up once in the repository: Can a model produce posterior samples without an observed kwarg in the model (and no potential either)? · Issue #6179 · pymc-devs/pymc · GitHub

My take there was that we can’t really distinguish a prior from a posterior. So we just go with the most common use. pm.sample can be perfectly used for prior, posterior and posterior predictive sampling. We just don’t want to bother the user with specifying which one it is. They can change the InferenceData group easily.

Sometimes, but not always. If there are transforms that distort the prior, like ordered or sumto1 on a variable, pm.sample will provide different (and correct) draws from the prior, whereas prior predictive won't. Similarly, Potentials are only taken into account by pm.sample. Otherwise, yes, they are equivalent.

Speaking of Potentials, they are also ambiguous as to whether they correspond to prior terms, likelihood terms, or both. Therefore we can't know whether a model without observations but with Potentials corresponds to a prior or a posterior.


The function names, although useful for beginners, are a bit misleading. The real distinction is that the predictive ones do forward/ancestral sampling while pm.sample does MCMC sampling.


Thank you @ricardoV94 and @bwengals, things are a lot clearer now.

@ricardoV94 just to confirm my understanding: when you say "forward / ancestral sampling", do you mean that "forward sampling" and "ancestral sampling" are the same thing? I can see from the discussion on this issue that there has been some debate on this previously - just seeking clarification on this point.

Yes, both refer to the same idea of taking random draws from the ancestral nodes and propagating those downstream/forward to other nodes that depend on them.
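For concreteness, here is a minimal forward/ancestral sampling sketch in plain Python (the toy model mu ~ Normal(0, 1), y ~ Normal(mu, 1) and the function name are made up for illustration): each draw starts at the root node and propagates downstream through the nodes that depend on it.

```python
import random

def forward_sample(n_draws, seed=0):
    """Ancestral sampling for the toy model
        mu ~ Normal(0, 1)    (root / ancestral node)
        y  ~ Normal(mu, 1)   (child node, conditioned on mu)
    No MCMC involved: every draw is independent and exact."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_draws):
        mu = rng.gauss(0.0, 1.0)  # draw the ancestral node first
        y = rng.gauss(mu, 1.0)    # then its descendant, given mu
        samples.append((mu, y))
    return samples

samples = forward_sample(100_000)
ys = [y for _, y in samples]
mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / len(ys)
# Marginally y ~ Normal(0, sqrt(2)): mean near 0, variance near 2.
```

This is what the predictive samplers do; by contrast, MCMC cannot draw a node before its parents independently, because it explores the joint density of all free variables at once.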


Thanks @ricardoV94 for the explanation. I have a follow-up question on the same.

Sometimes, but not always. If there are transforms that distort the prior, like ordered or sumto1 on a variable, pm.sample will provide different (and correct) draws from the prior, whereas prior predictive won't.

Could you help me understand how the ordered and sumto1 transforms would distort the prior when using pm.sample() but not when using pm.sample_prior_predictive()? An example would help.