Why is the noise included in the posterior predictive distribution in Bayesian regression?

Assume the following model: y = b_0 + b_1 * x, where we place priors on b_0 and b_1.

Let I denote our historical data and x^* denote future inputs.

Let p(b_0, b_1|I) denote the joint posterior.

We can then define the posterior predictive distribution as p(y^*|I, x^*) = \int p(y^*|b_0, b_1, x^*) p(b_0, b_1|I) \, db_0 \, db_1.

Now my question is: why do we also sample the noise in the likelihood term p(y^*|b_0, b_1, x^*), rather than disregarding it and using only the posterior samples to compute the desired quantity? Does this procedure have a name?


This model is not complete, as you have not specified how the left and right sides are actually connected (e.g., the equality as written is false unless your data is extremely unrealistic). Typically, people slap a “noise term” onto this expression to yield:

y = b_0 + b_1 * x + \epsilon

Sometimes, if they are particularly pedantic, they might specify the form of \epsilon:

\epsilon \sim N(0,\sigma)

But most people just leave this off and pretend like it doesn’t exist. But it does! In a Bayesian context, it’s more conventional to write the same expression like this:

y \sim N(b_0 + b_1 * x, \sigma)

or perhaps something like

\mu = b_0 + b_1 * x \\ y \sim N(\mu, \sigma)

This makes it much clearer that y is a random variable and that the “noise” is an intrinsic part of your model. Now when you go to generate posterior predictive samples, you can calculate \hat{\mu} and use it to generate one “regression line” per posterior predictive sample. Or you can generate a set of \hat{y} for each posterior predictive sample. Which do you choose?
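To make the two options concrete, here is a minimal NumPy sketch. The posterior draws below are hypothetical stand-ins generated from made-up normal distributions, not the output of a fitted model, and names such as `x_star` and `interval_width` are introduced only for illustration. It also assumes \sigma is a standard deviation with its own posterior draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for posterior draws of (b_0, b_1, sigma).
# In practice these would come from your sampler, not from the
# made-up distributions assumed here.
n_draws = 4000
b0 = rng.normal(1.0, 0.1, size=n_draws)
b1 = rng.normal(2.0, 0.05, size=n_draws)
sigma = np.abs(rng.normal(0.5, 0.02, size=n_draws))  # std. dev., kept positive

x_star = 3.0  # a single future input (assumed value)

# Option 1: one value of mu per posterior draw.
# This reflects uncertainty about the regression line only.
mu_star = b0 + b1 * x_star

# Option 2: one simulated observation per posterior draw.
# This adds the observation noise on top of the parameter uncertainty.
y_star = rng.normal(mu_star, sigma)


def interval_width(samples, mass=0.94):
    """Width of the central interval containing `mass` of the samples."""
    lo, hi = np.quantile(samples, [(1 - mass) / 2, (1 + mass) / 2])
    return hi - lo


mu_width = interval_width(mu_star)
y_width = interval_width(y_star)
print(f"interval width for mu*: {mu_width:.3f}")
print(f"interval width for y*:  {y_width:.3f}")
```

The interval for mu* says where the regression line probably is at x^*; the interval for y* says where a new observation would probably fall. Dropping the noise gives you the first one, which is a perfectly valid quantity, but it is not a prediction of new data.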

There is no universal “desired quantity” when one does a posterior predictive check. The user/analyst needs to figure out what exactly is being “checked” and generate quantities relevant for that goal.
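One way to see what separates the two quantities is to write the predictive distribution with \sigma included and apply the law of total variance. This is a sketch that assumes the normal likelihood above, with \sigma given a prior and therefore a posterior of its own:

```latex
p(y^* \mid I, x^*) = \int N(y^* \mid b_0 + b_1 x^*, \sigma)\, p(b_0, b_1, \sigma \mid I)\, db_0\, db_1\, d\sigma

\operatorname{Var}(y^* \mid I, x^*) = \underbrace{E[\sigma^2 \mid I]}_{\text{observation noise}} + \underbrace{\operatorname{Var}(b_0 + b_1 x^* \mid I)}_{\text{parameter uncertainty}}
```

Using only \hat{\mu} keeps the second term and drops the first, so the resulting spread describes the regression line rather than new observations.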

So, to answer your question (I think), the procedure you are describing is just called a posterior predictive check. But there are many such checks. Check this notebook out for some examples.