How to verify that uncertainty (estimated from pymc3) is accurate?

The paper I linked was for unconditional models, so I’ll switch here to the notation for linear regression with y conditional on x.

  1. This calibration assessment only works on simulated data, where you know the true parameters. In the case of linear regression you can use whatever x you want, then for M trials independently generate your own theta_i ~ p(theta) and y_i ~ p(y | x, theta_i) (see the first sketch after this list).

  2. Then, given the posterior for a trial, p(theta | x, y_i), you need to evaluate that posterior’s CDF at the true parameter, i.e. U_i = F_i(theta_i). If you have a closed-form CDF you can evaluate it directly (e.g. with the .cdf method of a scipy.stats distribution). Or, if you have enough samples from the posterior, you can compute the empirical quantile as the rank order of theta_i within the sorted samples: U_i = np.sum(samples < theta_i) / num_samples (second sketch below).

  3. Once you have U_{1:M} you can get a summary statistic using the Kolmogorov-Smirnov goodness-of-fit test against the Uniform(0, 1) distribution; if the posterior is calibrated the U_i are uniform, so this essentially tests the coverage of every credible interval at once (third sketch below). You can also qualitatively examine empirical-CDF or quantile-quantile (QQ) plots, as in Figure 1b of the linked paper. The larger M is, the smoother those curves will be: M = 20 may start to give you a feel for the shape, the curve will probably start to clean up around 100, and it gets quite clean around 500 or 1000 trials. That likely depends on the data and the problem, though.
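To make step 1 concrete, here is a minimal sketch in numpy. The model is a toy one I’m assuming just for illustration (a single slope theta with a standard normal prior and unit observation noise), not anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

M = 200  # number of calibration trials
N = 50   # observations per trial
x = rng.normal(size=N)  # fixed design; any x is fine

# Assumed toy generative model:
#   theta ~ Normal(0, 1),  y | x, theta ~ Normal(theta * x, 1)
thetas = rng.normal(size=M)                                   # theta_i ~ p(theta)
ys = thetas[:, None] * x[None, :] + rng.normal(size=(M, N))   # y_i ~ p(y | x, theta_i)
```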
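For step 2, continuing from the simulation above, a sketch of refitting the matching pymc3 model for each trial and computing the empirical quantile U_i. I’m assuming pymc3 3.9+ here (for the return_inferencedata keyword), and note this runs M full MCMC fits, so start with a small M:

```python
import pymc3 as pm

U = np.empty(M)
for i in range(M):
    with pm.Model():
        # Same prior and likelihood that generated the data
        theta = pm.Normal("theta", mu=0.0, sigma=1.0)
        pm.Normal("y", mu=theta * x, sigma=1.0, observed=ys[i])
        trace = pm.sample(1000, tune=1000, chains=2,
                          return_inferencedata=False, progressbar=False)
    samples = trace["theta"]
    # Rank of the true theta_i among the posterior samples: U_i = F_i(theta_i)
    U[i] = np.sum(samples < thetas[i]) / len(samples)
```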
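And for step 3, the Kolmogorov-Smirnov test against Uniform(0, 1) plus an empirical-CDF plot can be done with scipy and matplotlib:

```python
from scipy import stats
import matplotlib.pyplot as plt

# KS test of U_{1:M} against Uniform(0, 1); a small p-value flags miscalibration
ks_stat, p_value = stats.kstest(U, "uniform")
print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.3f}")

# Empirical CDF of the U_i; a calibrated posterior tracks the diagonal
u_sorted = np.sort(U)
ecdf = np.arange(1, M + 1) / M
plt.plot(u_sorted, ecdf, label="empirical CDF of U")
plt.plot([0, 1], [0, 1], "k--", label="Uniform(0, 1)")
plt.xlabel("u")
plt.ylabel("F(u)")
plt.legend()
plt.show()
```

If the curve hugs the dashed diagonal the posterior is well calibrated; systematic deviations indicate over- or under-confident uncertainty estimates.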
