Question Regarding Predictively Oriented Posteriors

Hey everyone,
I’ve been reading a bit about predictively oriented (PrO) posteriors and trying to understand how they behave beyond the high-level motivation.

From what I understand, the key idea is that instead of concentrating on a single parameter value, the posterior is defined through the induced predictive distribution. As a result, unlike standard Bayesian posteriors, PrO posteriors only collapse to a point mass when the model is exactly well specified; otherwise they stabilise to a non-degenerate distribution, where the remaining spread reflects model misspecification rather than lack of data.

What I’m still unclear about is how this limiting object looks mathematically in practice. In particular, I’m not sure under what conditions the predictively optimal posterior is unique, or how its variance relates quantitatively to the degree of misspecification. It also isn’t obvious to me how sensitive this behaviour is to the choice of predictive divergence being optimised.

On the computational side, I’ve seen proposals to sample PrO posteriors using mean-field Langevin dynamics, but I’m trying to understand how closely the resulting particle system actually tracks the intended predictive objective, especially in higher-dimensional or misspecified settings.
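To make my question concrete, here is a toy sketch of the kind of particle system I have in mind (my own construction in plain NumPy, not taken from any reference, and all names are mine): K particles follow Langevin dynamics on the log score of the mixture predictive p_hat(y) = (1/K) sum_k N(y; theta_k, 1), with data deliberately drawn from a bimodal distribution so the single-Gaussian model is misspecified.

```python
import numpy as np

rng = np.random.default_rng(0)

# Deliberately misspecified setting: the data come from a two-component
# mixture, but the model family is a single Gaussian N(theta, 1).
y = np.concatenate([rng.normal(-2.0, 0.5, 100), rng.normal(2.0, 0.5, 100)])

K = 64                                 # number of particles
theta = rng.normal(0.0, 3.0, K)        # initial particle cloud
eps, lam, steps = 1e-3, 0.1, 5000      # step size, temperature, iterations

def grad_log_predictive(theta, y):
    """Gradient, w.r.t. each particle, of sum_i log p_hat(y_i),
    where p_hat(y) = (1/K) sum_k N(y; theta_k, 1)."""
    d = y[None, :] - theta[:, None]            # (K, n) residuals
    phi = np.exp(-0.5 * d**2)                  # unnormalised N(y_i; theta_k, 1)
    w = phi / phi.sum(axis=0, keepdims=True)   # responsibility of particle k for y_i
    return (w * d).sum(axis=1)                 # (K,)

for _ in range(steps):
    g = grad_log_predictive(theta, y)
    theta = theta + eps * g + np.sqrt(2 * eps * lam) * rng.normal(size=K)

# Under misspecification the cloud does not collapse: particles settle
# near both data modes, and the residual spread reflects model inadequacy.
print(round(theta.std(), 2))
```

In this toy run the cloud settles around both data modes instead of collapsing to a point, which seems consistent with the behaviour described above, but I am unsure how faithfully this finite-particle scheme tracks the intended mean-field objective in higher dimensions.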

I’d really appreciate pointers to references, toy examples, or existing implementations that helped others build intuition around these questions. Thanks!

One more thing I wanted to point out: on the GSoC project ideas page where this is mentioned, we are explicitly asked to interact with PyTensor, the backend used by PyMC, but the listed link redirects to a page that does not seem to be working (https://www.pymc.io/projects/docs/en/stable/projects/docs/en/v5.0.2/learn/core_notebooks/pymc_pytensor.html). It would be great to get pointers to similar sources so I can understand what it was directing towards.

Hey @Vikram! Interesting topic; I haven’t seen anything on it before, so I can’t help you there. A corrected link to the PyMC and PyTensor notebook is here, though.

@jessegrabowski Thanks for the link! I noticed that the project page mentions potential mentors, but I don’t see their usernames listed here. Could you help tag Osvaldo Martin, Chris Fonnesbeck, and Yann McLatchie?

I think you got the gist of the topic. The PrO posterior is the distribution over parameters whose induced predictive distribution minimizes a chosen proper scoring rule. The limiting PrO posterior is unique if the scoring rule is strictly proper. Its variance reflects irreducible predictive uncertainty due to misspecification: the more the model fails to capture the data, the wider the spread. There are still open questions about the properties of these objects (at least for me), and from my perspective one goal of the project, besides writing the code, is to better understand the practical consequences and start thinking about good practices and recommendations.
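As a toy illustration of why the limiting PrO posterior stays non-degenerate under misspecification (a quick NumPy sketch I put together, assuming a N(theta, 1) model family and the log score; not code from any reference): with bimodal data, a predictive that mixes two parameter values strictly beats the best single parameter value under the log score, so the score-optimal distribution over theta cannot be a point mass.

```python
import numpy as np

rng = np.random.default_rng(1)
# Bimodal data; the model family N(theta, 1) cannot match it for any theta.
y = np.concatenate([rng.normal(-2.0, 0.5, 500), rng.normal(2.0, 0.5, 500)])

def avg_log_score(particles, y):
    """Average log predictive density of the mixture
    p_hat(y) = (1/K) sum_k N(y; theta_k, 1)."""
    d = y[None, :] - np.asarray(particles, dtype=float)[:, None]
    dens = np.exp(-0.5 * d**2) / np.sqrt(2 * np.pi)
    return np.log(dens.mean(axis=0)).mean()

point = avg_log_score([y.mean()], y)    # point-mass posterior at the Gaussian MLE
spread = avg_log_score([-2.0, 2.0], y)  # non-degenerate distribution over theta

print(spread > point)  # True: spreading posterior mass strictly improves the score
```

The gap between the two scores is one crude way to see the "spread reflects misspecification" point quantitatively: as the data modes move closer together, the advantage of the spread-out posterior shrinks and the optimum concentrates.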


@aloctavodia Are there any existing sources that go into this in more detail? I’d really appreciate pointers. In particular, I’m trying to understand whether there are papers, examples, or implementations that show how predictively oriented posteriors are actually constructed or approximated in practice. It would also help to know whether there is already an established way people approach this (for example, within probabilistic programming frameworks), or whether most of the work is still at the theoretical stage.

We are sailing into uncharted waters.
