Uncertainty of Model Predictions


#1

When using PPC for predictions, we always get some uncertainty caused by the MCMC estimation. The following documentation even mentions that this uncertainty is not due to the underlying model [1].

Note that these are uncertainty caused by MCMC estimation, and not from the underlying model. If we had taken more samples in sample_ppc, the error bars would get smaller quite quickly.

However, aren’t Bayesian models capable of providing the uncertainty of each prediction as a result of the underlying model architecture and the fit to the data (i.e., each prediction is a probability distribution)?

How can we estimate such uncertainty due to the fitting of the model using PyMC3?


#2

I think that sentence is a bit misleading.
What @colcarroll meant there is MC estimation, not MCMC. What he is trying to say in that sentence is that we are showing the expectation of the posterior predictive distribution and its error, both estimated with Monte Carlo sampling. If we use more samples, the sample mean of the MC samples will converge to the posterior expectation, and the error bar will get smaller (e.g., pm.sample_ppc(trace, samples=50000)).

It is important to distinguish between the uncertainty of the posterior and the uncertainty of the estimator - the former is a Bayesian concept while the latter is a Frequentist concept. So when you are trying to evaluate the uncertainty of the prediction, keep in mind what kind of uncertainty you are trying to evaluate. It is somewhat difficult to show in this doc, as the uncertainty of a Bernoulli distribution is directly linked to the underlying parameter p - which is also the mean of the distribution.
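To make the distinction concrete, here is a minimal sketch using only the Python standard library rather than PyMC3. The success probability p = 0.3 and the helper name bernoulli_mc_summary are made up for illustration; p is held fixed, so this only shows how the estimator's error behaves and says nothing about a posterior over p.

```python
import random
import statistics

def bernoulli_mc_summary(p, n_samples, rng):
    """Draw n_samples Bernoulli(p) values and summarize them.

    Returns (sample mean, spread of the draws, standard error of the mean).
    """
    draws = [1 if rng.random() < p else 0 for _ in range(n_samples)]
    mean = statistics.fmean(draws)
    spread = statistics.pstdev(draws)       # uncertainty of the distribution itself
    stderr = spread / n_samples ** 0.5      # uncertainty of the MC estimate of the mean
    return mean, spread, stderr

rng = random.Random(0)   # fixed seed so the run is repeatable
p = 0.3                  # hypothetical success probability, held fixed
summaries = {n: bernoulli_mc_summary(p, n, rng) for n in (100, 10_000, 100_000)}

for n, (mean, spread, stderr) in summaries.items():
    print(f"n={n:>7}  mean={mean:.4f}  spread={spread:.4f}  stderr={stderr:.5f}")
```

The spread of the draws should stay close to sqrt(p(1-p)) however many samples are taken, while the standard error of the mean shrinks roughly like 1/sqrt(n). The error bar that gets smaller with more samples is the second quantity.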


#3

Thanks for clearing that up @junpenglao.

I observed that increasing the number of samples also reduces the error. That means the error given is a combination of the uncertainties of both the posterior and the estimation.

If we want to measure the uncertainty of the posterior, then we can use a very large number of PPC samples to estimate the posterior predictive distribution and use the 95% interval or the variance of that distribution to determine the uncertainty of the posterior (assuming the error due to the estimator is insignificant for a large number of samples).

Am I correct?


#4

Yes and no. In this particular doc, the change you are seeing reflects a change in the uncertainty of the estimation, not of the posterior.

For a similar model you can estimate the posterior uncertainty by increasing the number of PPC samples, because for a Bernoulli the variance (or uncertainty) is linked to the mean, but in general that is not the case.

Think of a Gaussian, where we have better intuition:
Say you have a data-generating process $y \sim \text{Normal}(\mu, \sigma^2)$, with $\mu$ and $\sigma$ being the model parameters. When both parameters are fixed, the uncertainty of $y$ is fully explained by the properties of the Gaussian, and it is quantified by a function of $\sigma$ (e.g., if you quantify the uncertainty as the variance around the mean, then it is just $\sigma^2$).
Now, say you have $\sigma$ fixed and $\mu$ drawn from some distribution (in PPC, that would mean $\mu \sim \pi_{posterior}(...)$); then the distribution of $y$ becomes the result of a convolution (see Colah’s blog on this, where at the beginning he gives a very intuitive representation). So, to answer your initial question: the posterior uncertainty is in $\pi_{posterior}(...)$, whereas the PPC uncertainty is the convolution of $\pi_{posterior}(...)$ and a Gaussian with the fixed $\sigma$.

All of this is to show that increasing the number of PPC samples does not necessarily give you the ability to quantify the uncertainty of the posterior for $\mu$, but it does let you better estimate the uncertainty of $y$.
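A rough way to see this numerically, again with only the Python standard library: the sketch below assumes a Normal posterior for mu with mean 1.0 and standard deviation 0.5, and a fixed sigma of 2.0. These numbers are invented stand-ins for an actual MCMC trace, and ppc_draws is a hypothetical helper, not a PyMC3 function. Under these assumptions the variance of y should be close to sigma squared plus the posterior variance of mu, that is 4.0 + 0.25.

```python
import random
import statistics

rng = random.Random(1)            # fixed seed so the run is repeatable
post_mean, post_sd = 1.0, 0.5     # hypothetical Normal posterior for mu
sigma = 2.0                       # fixed observation noise

def ppc_draws(n):
    """Draw mu from the stand-in posterior, then y ~ Normal(mu, sigma)."""
    mus = [rng.gauss(post_mean, post_sd) for _ in range(n)]
    ys = [rng.gauss(mu, sigma) for mu in mus]
    return mus, ys

variances = {}
for n in (1_000, 100_000):
    mus, ys = ppc_draws(n)
    # Variance of the mu draws (posterior) and of the y draws (posterior predictive)
    variances[n] = (statistics.pvariance(mus), statistics.pvariance(ys))
    print(f"n={n:>7}  var(mu)={variances[n][0]:.3f}  var(y)={variances[n][1]:.3f}")
```

Taking more draws makes both variance estimates more stable, but the variance of y does not shrink towards the variance of mu. To read off the posterior uncertainty of mu you would look at the mu draws themselves, not at the y draws.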


#5

Great explanation. That is exactly what I wanted to know.

:smiley: thanks @junpenglao