Bayesian prudence or basic uncertainty management

Hi all!

TL;DR

When I’m working with a model that I know is too simple (either because I’m at the first stages of an iterative process or because of external constraints), I would like to somehow incorporate my belief about the model’s inaccuracy into the estimates. Something like widening credible intervals to account for that distrust.

How would you go about it? I think this is not part of standard practice, perhaps because it can easily slip into arbitrary decision making, but there may be more to it. What’s your take?

Toy example

Let’s say I’m modelling delays, and I’m working with the simplest model possible: a fixed-scale exponential. I’m quite certain that the scale is actually time-varying, but I don’t have the time or computational resources (imagine it’s not just an exponential but a much more complex model to begin with) to handle that.

For simplicity, in the example below the delays come from just two different scales, but I’m using a single scale to model them. The idea is that the model will be wrong, but no more wrong than having no model at all, so it makes sense to deploy it first and take advantage of its value while better approximations are developed.

Model code:

import numpy as np
import pymc as pm
import matplotlib.pyplot as plt

plt.style.use('ggplot')
np.random.seed(42)

# Delays actually come from two different exponential scales
scale1 = 1
scale2 = 5
samples1 = np.random.exponential(scale1, 500)
samples2 = np.random.exponential(scale2, 500)
all_samples = np.concatenate([samples1, samples2])

# ...but they are modelled with a single fixed-scale exponential
with pm.Model() as model:
    scale = pm.HalfNormal("scale", 10)
    obs = pm.Exponential("obs", scale=scale, observed=all_samples)
    trace = pm.sample()

Scale posterior:

pm.plot_posterior(trace, var_names=["scale"])

The true scales are 1 and 5, while the model is quite sure that the scale is between 2.5 and 3 (with equal numbers of samples from scales 1 and 5, the pooled sample mean is about 3, which is roughly what a single-scale fit recovers). To me this is perfectly fine: the model’s claim is not absolute but relative, that is, it’s conditional on the parametrisation being right. “If this is the data-generating process, then the scale parameter must be between 2.5 and 3.2.” The core of my question has to do with reframing the condition behind that statement (spoiler: I want probabilities conditional on my overall beliefs, not just on the model assumptions). This is easier to see by looking at the posterior predictive.

with model:
    posterior = pm.sample_posterior_predictive(trace)

ax = pm.plot_ppc(posterior, kind="cumulative", num_pp_samples=100)
# Empirical fraction of observed delays below 10
cdf_at_10 = (posterior.observed_data < 10).mean()
ax.axvline(10, linestyle="--")
ax.axhline(cdf_at_10.to_array().item(), linestyle="--")

Let’s say I’m concerned with the probability of a delay of less than 10 (minutes, say). Once again, the model is quite confident that the probability is between 0.94 and 0.96, when it’s actually about 0.93. I see no problem with this either: the probabilities are conditional on the model assumptions being right, which they are not.
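
For reference, the 0.93 can be checked directly from the true data-generating process, an equal mixture of the two exponential scales:

# True P(delay < 10) under the actual 50/50 mixture of scales 1 and 5
true_p = 0.5 * (1 - np.exp(-10 / scale1)) + 0.5 * (1 - np.exp(-10 / scale2))
print(true_p)  # ~0.93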

However, I want to use the model in production nonetheless because it’s still useful. I’m not worried about the predictions being biased, but I don’t want them to express overconfidence because they fail to take into account the inaccuracy introduced by the simplification. I’m not interested in predictions conditional only on the validity of the model’s assumptions; I also want them to reflect my own beliefs about that validity.

In summary, I don’t trust the model I made and I want to reflect that somehow in the predictions I get. I think this should translate into wider credible intervals for the predictions, as in the rough sketch below.
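
To make the question concrete, the only thing I can think of is something crude like tempering the likelihood: raise it to a power beta < 1 so that the data are downweighted relative to the prior and the posterior widens. The beta below is a hypothetical, hand-picked “distrust” knob, which is exactly the kind of arbitrary decision making I mentioned in the TL;DR:

# Sketch only: beta is a hypothetical, hand-picked trust factor
beta = 0.5

with pm.Model() as tempered_model:
    scale = pm.HalfNormal("scale", 10)
    # Raise the likelihood to the power beta by scaling its log-density
    pm.Potential("tempered_loglik", beta * pm.logp(pm.Exponential.dist(scale=scale), all_samples).sum())
    tempered_trace = pm.sample()

This does widen the posterior on scale (and hence any predictive intervals I would build from it), but choosing beta feels completely ad hoc, which is why I’m asking whether there is a more principled way.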

I think that, historically, prudence, as one of the cardinal virtues, was the art of incorporating doubt about one’s own assumptions into one’s decisions. I’m looking for a way to make this more explicit within a Bayesian framework.

Regards,
Juan.

If you want to stick to a Bayesian approach, you don’t. The posterior inferences are determined by the model and the data. Instead, when the model is too simple, you change the model. In the Bayesian world, you are looking for the literature on calibration (starting with Dawid and ending with Gneiting et al.).

By way of contrast, in ML, usually both the model and the algorithm to fit the model are in play. Rarely will an ML approach fit the actual model it specifies; there are almost always implicit priors (or what an ML researcher would call “inductive biases”). It sounds like what you’re looking for is what the ML folks call “conformal prediction.” That lets you take a miscalibrated ML output and try to calibrate it with empirical data.
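
For anyone new to the term, here is a generic split-conformal sketch (purely illustrative, not tied to the delay model above): use a held-out calibration set to find how much point predictions need to be widened so that the resulting intervals achieve a desired empirical coverage.

import numpy as np

def conformal_interval(y_cal, yhat_cal, yhat_new, alpha=0.1):
    # Nonconformity scores: absolute residuals on the calibration set
    scores = np.abs(y_cal - yhat_cal)
    n = len(scores)
    # Finite-sample-corrected quantile level, capped at 1
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(scores, level)
    # Widen the point prediction enough to cover ~(1 - alpha) of residuals
    return yhat_new - q, yhat_new + q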

Jessica Hullman has been blogging about calibration and conformal prediction on Andrew Gelman’s blog.

Thanks for your input, Bob. I’ve made a small edit to my question because you’ve made me realise my use of the word calibration was imprecise/misleading.

I’m okay with probabilities not matching empirical data. As I see it, if there’s any value in priors, I would actually like probabilities to be miscalibrated (calibration now used in the “conformal prediction” sense).

What I’m concerned with is that I get predictive distributions that are too narrow (an appraisal that could be based on domain expertise alone, not necessarily on data) because they are conditional on a set of assumptions that I know don’t exactly hold. So I want to incorporate this knowledge to add uncertainty to the predictions.

The edit I made:

Before: “I want confidence in predictions to be well calibrated.”
After: “I don’t want predictions to express overconfidence because they fail to take into account the inaccuracy introduced by simplification.”