I’m a little embarrassed asking this, because the question is not of great scientific interest but rather reflects a gap in my own knowledge.
I am trying to understand how to use a posterior predictive check (PPC) after building a Bayesian model with the PyMC library. I have never done this before and I don’t quite understand how to proceed…
Concretely, I would like to generate a large number of samples from the posterior predictive distribution and compare them to the sample I used to build the model, using a simple criterion: the maximum value of the samples. For this I refer to Ben Lambert’s video (What is a posterior predictive check and why is it useful? - YouTube, at 8:15), where he presents a graph that expresses exactly what I would like to obtain:
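To make the criterion concrete: each posterior predictive sample is one replicated dataset of the same size as the observed one, the test statistic is the maximum of each dataset, and the check compares the observed maximum to the distribution of replicated maxima. A minimal illustration with synthetic stand-in data (the arrays here are made up, not drawn from a real model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the observed sample of 20 values.
y_obs = rng.normal(loc=600, scale=30, size=20)

# Stand-in for 4000 replicated datasets from the posterior
# predictive distribution: shape (n_draws, n_obs).
y_rep = rng.normal(loc=600, scale=30, size=(4000, 20))

# Test statistic: the maximum of each dataset.
t_obs = y_obs.max()
t_rep = y_rep.max(axis=1)  # one maximum per replicated dataset

# Posterior predictive p-value: fraction of replicated maxima
# at least as extreme as the observed one.
p_value = (t_rep >= t_obs).mean()
```

The histogram of `t_rep`, with `t_obs` marked on it, is exactly the kind of plot shown in the video.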
So I built a simple model, where I gave myself a sample of 20 values that are supposed to come from a normal distribution with mean mu and standard deviation sigma, which I am trying to estimate. (Note that I chose a mu and a sigma to generate the sample on which the model is based.)
I then built the code from various examples and included the sample_posterior_predictive call following the information given in the “Samplers” section of the API — PyMC 5.1.2 documentation. The code works (on my computer) and does not generate errors.
On the other hand, I cannot find how to concretely use the posterior predictive distribution from my program to generate the histogram of the maximum values of the samples drawn from it. I am still training myself in Bayesian analysis and have a hard time manipulating PyMC outputs, which I am not yet familiar with; if you could give me some pointers…
The code used is:
```python
import numpy as np
import pymc as pm
import arviz as az

np.random.seed(63123)
data = np.random.normal(loc=600, scale=30, size=20)

with pm.Model() as model:
    # Priors for unknown model parameters
    mu = pm.Uniform('mu', lower=0, upper=2000)
    sigma = pm.Uniform('sigma', lower=0, upper=100)

    # Likelihood of observations
    lkd = pm.Normal('likelihood', mu=mu, sigma=sigma, observed=data)

    # Expected value of outcome (WHAT IS THE USE ???)
    weights = pm.Normal('weights', mu=mu, sigma=sigma)

    # Inference data
    idata = pm.sample()

    # Predictives
    pm.sample_prior_predictive(samples=500, var_names=None, random_seed=None,
                               return_inferencedata=True, idata_kwargs=None,
                               compile_kwargs=None)
    pm.sample_posterior_predictive(idata, var_names=None, sample_dims=None,
                                   random_seed=None, progressbar=True,
                                   return_inferencedata=True,
                                   extend_inferencedata=True, predictions=True,
                                   idata_kwargs=None, compile_kwargs=None)
```
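For what it’s worth, here is the kind of post-processing I believe I am after, sketched with a synthetic NumPy array standing in for the real output (since with predictions=True the draws should end up in a group of the InferenceData object, laid out as chain × draw × observation — that layout is my assumption here):

```python
import numpy as np

rng = np.random.default_rng(63123)

# Stand-in for the posterior predictive draws of 'likelihood',
# e.g. something like idata.predictions["likelihood"].values:
# shape (chains, draws, n_obs).
ppc = rng.normal(loc=600, scale=30, size=(4, 1000, 20))

# Collapse chains and draws into one axis of replicated datasets.
ppc_flat = ppc.reshape(-1, ppc.shape[-1])  # (4000, 20)

# Maximum of each replicated dataset -> values for the histogram.
max_rep = ppc_flat.max(axis=1)

# Histogram counts; plt.hist(max_rep, bins=30) would plot the same thing.
counts, edges = np.histogram(max_rep, bins=30)
```

Comparing this histogram against the maximum of the observed data would reproduce the check from the video, if I understand it correctly.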