Bayesian Regression: Inconsistent Prior Predictive Checks

J_V · October 9, 2023, 8:22am

“The problem is that your approach goes the opposite way: you’re trying to make your distributions fit values pre-assigned to the prior parameters (obtained from point estimates). That’s not going to work.”

I don’t understand why it’s not going to work. If I’m providing all the information to the model (the pre-assigned values), shouldn’t the model perform even better? There shouldn’t be any issues with recovering what it already knows.

“LogNormal distributions have very long tails, so a great amount of zero values happen towards the tail of the distribution, which will push your mean to zero.”

That makes sense; I need to find a distribution with a shorter tail or less weight in the tail. I’m currently searching for such a distribution.
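As a rough sketch of what "less weight in the tail" could look like, the snippet below compares a LogNormal with a Gamma that has the same mean and variance. The Gamma is only one possible candidate and is my own assumption, not something suggested in the thread; the values of `mu` and `sigma` are hypothetical, and the comparison uses plain NumPy sampling rather than a PyMC model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Hypothetical LogNormal parameters (location and scale on the log scale).
mu, sigma = 0.0, 1.0
ln_mean = np.exp(mu + sigma**2 / 2)
ln_var = (np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2)

# Gamma matched to the same mean and variance (moment matching).
shape = ln_mean**2 / ln_var
scale = ln_var / ln_mean

log_normal = rng.lognormal(mean=mu, sigma=sigma, size=n)
gamma = rng.gamma(shape, scale, size=n)

# Compare how far out the upper tail reaches at the same quantile.
q = 0.999
ln_q = np.quantile(log_normal, q)
gamma_q = np.quantile(gamma, q)

print(f"means:          LogNormal={log_normal.mean():.3f}  Gamma={gamma.mean():.3f}")
print(f"{q:.1%} quantile: LogNormal={ln_q:.2f}  Gamma={gamma_q:.2f}")
```

With matched moments, the LogNormal's upper quantile sits further out than the Gamma's, which is the "long tail" effect described in the quote. Whether a Gamma actually suits a given model depends on the data.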

“It’s also better to start the prior predictive checks from less informative priors, unless they come from very clear theoretical knowledge, or from a previous Bayesian model (prior update).”

Well, I suppose what I’m attempting to do is a “Bayesian prior update,” but I’m clearly going about it the wrong way. I’ve also tried simulating this update by starting with only four samples and gradually increasing the sample size to 100 to see how the model improves. My conclusion is that uncertainty decreases and the R-squared value, calculated between the mean of the predicted data and the actual observations, increases. Nothing new here. Which other metrics could I use to evaluate the effect of sample size on my model?
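Two candidate metrics, sketched below under simplifying assumptions, are the posterior standard deviation of the slope and the RMSE on held-out data, each tracked as the sample size grows from 4 to 100. To keep the example self-contained it uses a conjugate linear regression with a known noise standard deviation instead of a PyMC model; the true coefficients, the noise level, and the prior width are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = a + b * x + noise, with known noise sd.
TRUE_A, TRUE_B, NOISE_SD = 1.0, 2.0, 0.5
PRIOR_SD = 10.0  # weakly informative Normal(0, PRIOR_SD) prior on both coefficients


def posterior(x, y):
    """Conjugate posterior mean and covariance for (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    precision = X.T @ X / NOISE_SD**2 + np.eye(2) / PRIOR_SD**2
    cov = np.linalg.inv(precision)
    mean = cov @ (X.T @ y) / NOISE_SD**2  # prior mean is zero
    return mean, cov


x_all = rng.uniform(0, 5, size=100)
y_all = TRUE_A + TRUE_B * x_all + rng.normal(0, NOISE_SD, size=100)

# Held-out data used only for the predictive error.
x_test = rng.uniform(0, 5, size=1000)
y_test = TRUE_A + TRUE_B * x_test + rng.normal(0, NOISE_SD, size=1000)

results = {}
for n in (4, 10, 25, 50, 100):
    mean, cov = posterior(x_all[:n], y_all[:n])
    slope_sd = np.sqrt(cov[1, 1])
    rmse = np.sqrt(np.mean((y_test - (mean[0] + mean[1] * x_test)) ** 2))
    results[n] = (slope_sd, rmse)
    print(f"n={n:3d}  slope posterior sd={slope_sd:.3f}  held-out RMSE={rmse:.3f}")
```

The posterior standard deviation shows how quickly the parameter uncertainty shrinks, while the held-out RMSE shows when extra samples stop improving predictions. In a sampled PyMC model the same quantities could be read from the posterior draws.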

“An important note is that the point of assessing your models with toy data (or generative model simulations) is to test whether you can retrieve the original parameters by using relatively uninformative priors in your model. So, you should not use the same alpha for your HalfNormal priors. Also, beta_m is a prior for the slope, so using alpha there does not seem too appropriate. Additionally, HalfNormal’s default parameter is a scale parameter (sigma), so it’s not the same as the location parameter alpha you used for the LogNormal. All these details may seem a bit nitpicky, but understanding them well can greatly help with prior selection. Hope this helps.”
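To make the scale-versus-location point concrete, here is a minimal NumPy sketch that reuses one number, a hypothetical `alpha = 2.0`, as the `sigma` of a HalfNormal and as the `mu` of a LogNormal. The LogNormal's own `sigma = 1.0` is also an assumption. The HalfNormal is sampled as the absolute value of a zero-centred Normal.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
alpha = 2.0  # hypothetical value reused for both priors

# HalfNormal(sigma=alpha): alpha is a scale, so the mean is alpha * sqrt(2 / pi).
half_normal = np.abs(rng.normal(0.0, alpha, size=n))

# LogNormal(mu=alpha, sigma=1): alpha is a location on the log scale,
# so the median is exp(alpha).
log_normal = rng.lognormal(mean=alpha, sigma=1.0, size=n)

hn_mean = half_normal.mean()
ln_median = np.median(log_normal)

print(f"HalfNormal mean:  {hn_mean:.3f}  (theory: {alpha * np.sqrt(2 / np.pi):.3f})")
print(f"LogNormal median: {ln_median:.3f}  (theory: {np.exp(alpha):.3f})")
```

The same number therefore implies very different priors: as a HalfNormal scale it puts typical values below `alpha`, while as a LogNormal location it centres the distribution near `exp(alpha)`.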

Okay, now I see why it’s not advisable to provide pre-assigned values to the model, as mentioned in the first paragraph. These details you’ve shared are extremely useful. Thank you for these great instructions!

P.S. I’m writing all my thoughts here because they might be useful for other beginners in Bayesian approaches.
