Bayesian Regression: Inconsistent Prior Predictive Checks

That’s the point of prior predictive checks: they let you calibrate your priors before running the model. The problem is that your approach goes the opposite way; you’re trying to make your distributions fit pre-assigned prior parameter values (obtained from point estimates). That’s not going to work. LogNormal distributions have very long tails, so a great number of near-zero values occur alongside rare extreme draws in the tail, which will push your mean towards zero. That does not necessarily mean the model is worse at inference (the same goes for WAIC or LOO model comparison, which are more indicative of predictive capacity). However, a more constrained model could be better for theoretical reasons (again, that’s what you must decide according to your domain knowledge).
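To see that tail behaviour concretely, here is a standalone NumPy sketch (not part of the model below; the mu = 4, sigma = 0.8 values just mirror the simulation further down). The mean lands well above the median, which is what the heavy right tail does:

```python
import numpy as np

# Illustrative only: LogNormal draws are heavily right-skewed, so the
# mean sits well above the median, and much of the mass bunches up at
# small values relative to the tail.
rng = np.random.default_rng(27)
draws = rng.lognormal(mean=4, sigma=0.8, size=100_000)

print(draws.mean())     # pulled up by rare extreme draws
print(np.median(draws)) # noticeably smaller than the mean
```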

A Normal distribution is definitely inappropriate, as it allows for negative values, but maybe a TruncatedNormal could work better. It’s also better to start the prior predictive checks from less informative priors, unless they come from very clear theoretical knowledge, or from a previous Bayesian model (prior update).
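As a quick standalone illustration of why the truncation matters (arbitrary mu/sigma, using SciPy rather than PyMC): a Normal produces negative draws, while a TruncatedNormal with a lower bound of 0 cannot. Note that SciPy's `truncnorm` takes its bounds on the standardized scale, `a = (lower - mu) / sigma`:

```python
import numpy as np
from scipy import stats

mu, sigma = 1.0, 2.0  # illustrative values, not the model's priors
rng = np.random.default_rng(0)

normal_draws = stats.norm(loc=mu, scale=sigma).rvs(10_000, random_state=rng)
trunc = stats.truncnorm(a=(0 - mu) / sigma, b=np.inf, loc=mu, scale=sigma)
trunc_draws = trunc.rvs(10_000, random_state=rng)

print((normal_draws < 0).mean())  # a clear fraction of negative values
print(trunc_draws.min())          # never below 0
```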

import numpy as np
import matplotlib.pyplot as plt
import pymc as pm
import arviz as az

r = 0.8
alpha = 4
sigma = 0.8

x0 = np.sort(pm.draw(pm.LogNormal.dist(alpha, sigma), draws=100, random_seed=27))
x = np.sort(pm.draw(pm.LogNormal.dist(alpha, sigma), draws=100, random_seed=77))

y = r*x + np.sqrt(1-r**2)*x0  

with pm.Model() as mod:
    alpha_m = pm.HalfNormal("alpha", 1) #intercept
    beta_m = pm.HalfNormal("beta", 1) #slope
    mu = pm.Deterministic("mu", alpha_m + beta_m*x) #mean
    sigma_m = pm.HalfNormal("sigma", 10) #sd (renamed to avoid shadowing the sigma above)
    y_hat = pm.TruncatedNormal("y", mu=mu, sigma=sigma_m, lower=0, observed=y)

with mod:
    prior_pred = pm.sample_prior_predictive() 

az.plot_ppc(prior_pred, group="prior")
az.plot_ppc(prior_pred, group="prior", kind="cumulative")

[prior predictive plots: density and cumulative checks]

This definitely looks better than the LogNormal, and allows you to have a more appropriate likelihood (no negative values). Now, it’s important to keep in mind that the simulated data come from a LogNormal distribution, which is why my model in the previous example can recover the parameters with reasonable precision. The TruncatedNormal model does a reasonable job for the mean (alpha), but it’s a bit off for the sd (sigma).

Comparing the 3 models seems useful as well (though this is often not recommended as a method for model selection); note that the Normal model now has 0 weight, and the LogNormal gives a warning (something may be mis-specified):

| model | rank | elpd_loo | p_loo | elpd_diff | weight | se | dse | warning | scale |
|---|---|---|---|---|---|---|---|---|---|
| TruncatedNormal | 0 | -362.1440699 | 9.12993812 | 0 | 0.962512852 | 15.11492559 | 0 | FALSE | log |
| Normal | 1 | -362.8778718 | 9.6356368 | 0.733801882 | 0 | 15.43134132 | 0.521035978 | FALSE | log |
| LogNormal | 2 | -480.1654312 | 6.444936509 | 118.0213612 | 0.037487148 | 12.77992679 | 14.71650608 | TRUE | log |

An important note: the point of assessing your models with toy data (or generative-model simulations) is to test whether you can retrieve the original parameters using relatively uninformative priors in your model. So, you should not use the same alpha for your HalfNormal priors. Also, beta_m is a prior for the slope, so using alpha there does not seem appropriate. Additionally, HalfNormal’s default parameter is a scale parameter (sigma), not the same as the location parameter alpha you used for the LogNormal. All these details may seem a bit nitpicky, but ultimately knowing them very well can greatly help with prior selection. Hope this helps.
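To make that last point concrete (a standalone SciPy sketch, not from the model above): HalfNormal's parameter is a scale, so the distribution's mean is sigma * sqrt(2/pi). Plugging a LogNormal location value in as if it were a location does something quite different:

```python
import numpy as np
from scipy import stats

sigma = 4.0  # if you reuse a LogNormal location value here, it acts as a scale
rng = np.random.default_rng(3)
draws = stats.halfnorm(scale=sigma).rvs(200_000, random_state=rng)

print(draws.mean())                 # close to sigma * sqrt(2 / pi), not sigma
print(sigma * np.sqrt(2 / np.pi))
```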
