Why does this Bayesian regression analysis fail?

jessegrabowski · June 5, 2023, 10:33am

I think this is a scale problem. Your Date variable takes quite large values, in the thousands. So if you imagine drawing the regression line back from your data points (heights around 30) all the way to Date = 0, it has a long way to fall before it gets there. In other words, the intercept is likely to be a large negative number. Computing the OLS coefficients can be instructive:

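import numpy as np
# Design matrix: a column of ones (intercept) next to the raw dates
X = np.c_[np.ones(len(df)), df.Date.values]
np.linalg.solve(X.T @ X, X.T @ df.Height.values)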

>>>Out: np.array([-302.9709337     0.28391778])

So the intercept is about -303, which according to your prior has a probability very close to zero (0.0008, I believe). Because the prior rules out the intercepts that would allow a slope consistent with your data, the model compromises by abandoning the slope and inflating the uncertainty around a flat line.
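To get a feel for how strongly a prior can rule out an intercept like this, here is a minimal sketch. It assumes a zero-centered Normal prior on the intercept, and the sigma values are placeholders chosen for illustration, not the ones from the original model. The helper `normal_cdf` is a hypothetical name written for this example.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma), via the complementary error function."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))

# Roughly the OLS intercept reported above
intercept = -300.0

# Hypothetical prior widths (assumptions, not taken from the original model)
for sigma in (10.0, 100.0, 1000.0):
    p = normal_cdf(intercept, mu=0.0, sigma=sigma)
    print(f"sigma={sigma:>6}: P(intercept <= {intercept}) = {p:.3g}")
```

Under these assumed priors, the mass at or below -300 goes from essentially nothing to a substantial fraction as sigma grows, which is why a narrow intercept prior forces the model away from the slope the data want.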

You can solve this in one of two ways. First, you could crank up the sigma on the intercept prior so that it allows values like -300, or even switch it to something fat-tailed, like a Cauchy. This works fine, but a more general solution (and the one pretty much everyone recommends) is to scale your inputs. Instead of fitting height ~ a + b * date, fit height ~ a + b * z_date, where z_date = (df.Date - df.Date.mean()) / df.Date.std(). Standardizing your inputs to have zero mean and unit variance makes it much easier to reason about priors, because the intercept becomes the expected height at the average date rather than at Date = 0. It also helps the sampler cruise along in many cases. See this discussion, for example.
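The effect of standardizing can be checked with plain OLS before touching the Bayesian model. The sketch below uses made-up data on a similar scale (dates in the thousands, heights around 30); these are not the values from the original df, and the `ols` helper is a hypothetical name for this example. It also shows how to map the standardized coefficients back to the raw scale.

```python
import numpy as np

# Hypothetical data standing in for df.Date and df.Height (not the original values)
date = np.array([1100.0, 1120.0, 1140.0, 1160.0, 1180.0, 1200.0, 1220.0])
height = np.array([9.0, 15.0, 21.0, 26.0, 32.0, 38.0, 43.0])

def ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b * x."""
    X = np.c_[np.ones(len(x)), x]
    return np.linalg.solve(X.T @ X, X.T @ y)

# Raw scale: the intercept is the predicted height at date = 0, far from the data
a_raw, b_raw = ols(date, height)

# Standardized scale; ddof=1 matches the default of pandas' Series.std()
date_mean, date_std = date.mean(), date.std(ddof=1)
z_date = (date - date_mean) / date_std
a_z, b_z = ols(z_date, height)

# Map the standardized coefficients back to the raw scale
b_back = b_z / date_std
a_back = a_z - b_back * date_mean

print("raw:         ", a_raw, b_raw)
print("standardized:", a_z, b_z)
print("back-mapped: ", a_back, b_back)
```

On the standardized scale the intercept equals the mean height, so a prior centered near the data's average is a natural choice, and the slope reads as the change in height per one standard deviation of date.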
