Why is a posterior created in pm.Model() when no data are given?

I’m trying to practice PyMC but, to say the least, I’m not very comfortable interpreting the results…
For example, I don’t understand this: I create a very small model where I just set my prior (let’s say my test hasn’t started yet and I don’t have any data yet). So, here it is:

import pymc as pm

with pm.Model() as coin_flip_model:
    # Define a prior for the parameter
    prior_p = pm.Beta("p", alpha=0.5, beta=0.5)

with coin_flip_model:
    # Create inference data
    model_trace = pm.sample()

And now, my inference data model_trace contains a posterior:
model_trace
I thought that a posterior only made sense after having carried out an experiment and obtained data…
In the case I’m describing, how can a posterior be created? Something escapes me…

We do that out of convenience since it’s 99.99% of the use cases. You can use pm.sample to get anything from prior to posterior to posterior predictive, but we don’t want to ask users which group they want to put it in since it’s almost always posterior.

Archived discussion here: Can a model produce posterior samples without an observed kwarg in the model (and no potential either)? · Issue #6179 · pymc-devs/pymc · GitHub

Well, I understand the point, but it’s still quite destabilizing.

For example, I can plot the p values from the so-called posterior group and get the following plot:
[Plot: p_post]
It looks like sampling from the Beta(0.5, 0.5) prior…, which it is, obviously.

On the other hand, if I want to include a prior group in my InferenceData, I need (or maybe not?) to run:

with coin_flip_model:
    model_trace = pm.sample()
    model_prior = pm.sample_prior_predictive()

model_trace.extend(model_prior)

And, in that case, I get p values from both the prior and posterior groups. But are these p values from the prior really predictive? Actually, predictive of what, in that case, with no data included?

You see, it seems to me that there are always things that are not very clear in the notations…(even if it’s maybe not very important nitpicking)

I’m not sure I agree that the posterior without any data is not clearly defined. You can still compute P(y | theta), even if y contains no values. In that case that will just be 1, and the posterior will be exactly the prior, which is what pm.sample() gives you.
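To make that concrete, here is a minimal sketch that does not use PyMC at all: a plain grid approximation with the standard library. The helper names (beta_pdf, likelihood, grid_posterior) and the four example flips are my own assumptions for illustration, not anything from the model above. With an empty list of flips the likelihood is the empty product, 1, so the normalised "posterior" weights should match the normalised prior weights.

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def likelihood(p, flips):
    """Bernoulli likelihood of a list of 0/1 flips; the empty product is 1."""
    return math.prod(p if y == 1 else 1 - p for y in flips)

def grid_posterior(flips, a=0.5, b=0.5, n=999):
    """Normalised posterior weights on an interior grid of p values."""
    grid = [(i + 1) / (n + 1) for i in range(n)]
    unnorm = [likelihood(p, flips) * beta_pdf(p, a, b) for p in grid]
    total = sum(unnorm)
    return grid, [w / total for w in unnorm]

# Posterior with no data at all
grid, no_data_post = grid_posterior([])

# Normalised prior on the same grid, for comparison
prior_raw = [beta_pdf(p, 0.5, 0.5) for p in grid]
prior = [w / sum(prior_raw) for w in prior_raw]

# Largest pointwise difference between "posterior" and prior
max_gap = max(abs(u - v) for u, v in zip(no_data_post, prior))
mean_no_data = sum(p * w for p, w in zip(grid, no_data_post))

# Hypothetical data (three heads, one tail) to show the contrast
_, with_data_post = grid_posterior([1, 1, 0, 1])
mean_with_data = sum(p * w for p, w in zip(grid, with_data_post))

print("max |posterior - prior| with no data:", max_gap)
print("posterior mean, no data:", mean_no_data)
print("posterior mean, with data:", mean_with_data)
```

Only once some flips are passed in does the posterior move away from the prior.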

Hey Andre. The behavior you are seeing is consistent with the docstring of pm.sample:

Draw samples from the posterior using the given step methods.

So if you call sample you should expect posterior draws. Perhaps we could raise an exception if you call sample with a model that does not have a likelihood (as we do when there are no free variables). We’ve also discussed renaming sample to something like sample_posterior to make things extra clear, but it did not gain much traction. Feel free to submit an issue with a feature request if you have an idea about how it should behave.

Thanks for having clarified this point! No need to submit an issue. This way of systematically creating a posterior group, even in the absence of data, had seemed a little strange to me, but, basically, it remains logical: if there is no data, posterior = prior, posterior predictive = prior predictive, and you just have to know it…, right?


This much is true and should make sense. Imagine you have beliefs (your prior). I present you with a data set, but the data set contains no data. I then ask you what your beliefs are (your posterior). You should report that your “new” beliefs are identical to your “old” beliefs.

When you ask for a prior/posterior predictive sample, you are asking for samples from the prior/posterior to be pushed through the model, with the ultimate output being a new, synthetic version of all observed variables. When you have no data, you have no observed variables. So when you ask for a prior/posterior predictive sample, it does what it can: it pushes samples from the prior/posterior through the model and generates all intermediate quantities. You can verify this because the result of, for example, pm.sample_prior_predictive() will not have a prior_predictive group in it.
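A rough sketch of what "pushing prior samples through the model" means, again using only the standard library rather than PyMC itself. The function names (sample_prior, push_through_model) are hypothetical, and the 10 planned flips are an assumption I am adding for illustration; the original model has no observed variable at all. This mimics the idea, not PyMC's internals.

```python
import random

def sample_prior(rng, draws, alpha=0.5, beta=0.5):
    """Draw values of p from the Beta(alpha, beta) prior."""
    return [rng.betavariate(alpha, beta) for _ in range(draws)]

def push_through_model(rng, p_draws, n_flips):
    """For each prior draw of p, simulate n_flips coin flips and count heads.

    This plays the role of an observed variable: it is the synthetic data
    that a prior predictive sample would contain.
    """
    return [sum(rng.random() < p for _ in range(n_flips)) for p in p_draws]

rng = random.Random(0)

# Prior draws of p: this is all there is when the model has no observed variable
p_draws = sample_prior(rng, draws=1000)

# Assumed experiment design: 10 flips per synthetic data set
heads = push_through_model(rng, p_draws, n_flips=10)

print("first prior draws of p:", [round(p, 3) for p in p_draws[:3]])
print("first synthetic head counts:", heads[:3])
```

The p_draws list corresponds to what lands in the prior group; the heads list is the part that is "predictive", and it only exists once the model says what would be observed.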