Best way to use posterior as a prior for another analysis?

Hello,

I am analyzing experimental data in an incremental way.
I’d like to use the posterior produced by one experiment’s analysis as the prior for the next experiment. What is the best way to do this in PyMC3?

The posterior comes in the form of a trace of a random variable, and I don’t really have any clue
about the type of distribution (its support is [0, +inf)). I’m using a weakly informative prior (in the form of a Potential) for the first experiment.

As far as I know PyMC3, my options are:

  • DensityDist seems to be the solution, but I’ll have to use some method to approximate the logp function from the trace. Any suggestions for an interpolation technique or a compatible density estimation library?
  • Interpolated also seems to be a solution, provided a set of points is given as a base for the interpolation. Question: is a basic histogram (from numpy, for instance) enough?
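
For the Interpolated route, a KDE is usually a smoother base than a raw histogram, which can be noisy for short traces. A minimal sketch of turning samples into interpolation points, using scipy’s gaussian_kde and a synthetic gamma “trace” standing in for a real posterior (names here are illustrative, not from any PyMC API):

```python
import numpy as np
from scipy import stats

def interpolation_points_from_trace(samples, n_points=100):
    """Turn posterior samples into (x, pdf) point pairs.

    A Gaussian KDE smooths the sample density; evaluating it on a
    regular grid gives the support/pdf arrays an interpolated prior
    can be built from.
    """
    kde = stats.gaussian_kde(samples)
    x = np.linspace(samples.min(), samples.max(), n_points)
    return x, kde(x)

# Synthetic stand-in for a trace with support on [0, +inf)
rng = np.random.default_rng(0)
trace = rng.gamma(shape=2.0, scale=1.5, size=5000)

x_points, pdf_points = interpolation_points_from_trace(trace)
```

The resulting pair could then be passed to pm.Interpolated in the next model (as in the Updating Priors example, which uses exactly this KDE-on-a-grid approach); as far as I remember, Interpolated normalizes the pdf values itself, so a raw histogram would also work, just more noisily.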

Is there any other option that I missed?

Thanks for any information or hint you can share on the topic.

Regards,
H.


Have you seen this pymc example already? Updating Priors — PyMC example gallery


No, it’s the first time I’ve seen that example.
It’s exactly what I needed, thanks a lot @ricardoV94.

Sorry for replying to this post so late. It seems the link https://docs.pymc.io/notebooks/updating_priors.html has expired. Are there any newer examples solving this kind of question? Thanks a lot!


From the linked reference:

This example provides a very nice usage example for the Interpolated class, as we will see below. However, this might not be a good idea to do in practice, not only because KDEs are being used to compute pdf values for the posterior, but mostly because Interpolated distributions used as priors are unidimensional and uncorrelated. So even if they are a perfect fit marginally, they don’t really incorporate all the information we have from the previous posterior into the model, especially when posterior variables are correlated.

(Which by the way makes me think: doesn’t “unidimensional” already imply “uncorrelated”? Wouldn’t “priors are unidimensional and therefore mutually independent” be a clearer way to put it? Am I missing something here?)

Following @OriolAbril’s post, there doesn’t seem to be a nice way to do incremental updates for the more general case of multidimensional priors, unless there’s been progress in the last two years. As far as I know, there are two reasons to attempt the incremental path: first, it’s quite elegant; second, it lets you distribute computation cost (which tends to be an issue in Bayesian modelling) over time. For the latter, there’s this thing called “Amortized Bayesian Inference”, which I barely know about, but I figured I could leave a reference in case anyone is interested: BayesFlow — BayesFlow: Amortized Bayesian Inference (I’m also interested in others’ opinions about this or any other alternatives). For the former I haven’t found any solutions (which makes me sad :cry:), but hopefully someone comes up (or has already come up) with something.

There’s also prior_from_idata — pymc_experimental 0.1.4 documentation
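
For correlated posteriors, one simple way to carry the correlations into the next model (in the same spirit as prior_from_idata, which as I understand it fits a multivariate normal in a transformed space) is to moment-match a multivariate normal to the joint posterior samples. A minimal numpy sketch, with a synthetic correlated “posterior” standing in for a real trace:

```python
import numpy as np

# Synthetic stand-in for a 2-D posterior with correlated parameters
rng = np.random.default_rng(42)
true_cov = np.array([[1.0, 0.8],
                     [0.8, 1.0]])
posterior_samples = rng.multivariate_normal([0.0, 0.0], true_cov, size=10_000)

# Moment-match a multivariate normal to the joint posterior:
# estimate the mean vector and full covariance matrix from the samples.
mu = posterior_samples.mean(axis=0)
cov = np.cov(posterior_samples, rowvar=False)
```

In the next model, mu and cov could seed something like pm.MvNormal("theta", mu=mu, cov=cov), preserving the cross-parameter correlation that per-dimension Interpolated priors throw away. Of course this assumes the posterior is reasonably Gaussian on the working scale.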

But in general it’s a tradeoff between approximating the previous posterior and starting again from the prior with all the data.
