Hi, first of all, thanks for maintaining a nice ecosystem for modeling in Python.

We have a number of neuroscience models in PyMC3 and Stan, and partly because some of these models have expensive likelihood evaluations, we started looking at simulation-based inference techniques as implemented in the sbi package from the Macke lab [1], among others. Having now read some of the papers there, I realized the approach is very close to variational inference, except that instead of optimizing variational parameters directly, one first draws many prior predictive samples, chooses a summary statistic, and trains a neural network to map summary statistics to variational parameters. I expect a lot of the PPL machinery already in PyMC is applicable, and one would just need to implement the neural-network aspects.
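To make the pipeline concrete, here is a toy sketch of the idea on a conjugate Gaussian model, with a closed-form least-squares fit standing in for the neural network (all names and the model here are my own illustration, not anything from sbi or PyMC):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conjugate model: theta ~ N(0, 1), x_i ~ N(theta, 1), i = 1..20.
n_sims, n_obs = 5000, 20
theta = rng.normal(0.0, 1.0, size=n_sims)                # prior draws
x = rng.normal(theta[:, None], 1.0, (n_sims, n_obs))     # prior predictive sims
s = x.mean(axis=1)                                       # summary statistic

# Stand-in for the neural network: least-squares fit of a linear map
# from the summary statistic to the posterior mean, plus the residual
# variance, giving an approximate posterior q(theta | s) = N(w*s + b, v).
A = np.vstack([s, np.ones_like(s)]).T
(w, b), res, *_ = np.linalg.lstsq(A, theta, rcond=None)
v = res[0] / n_sims

# Amortized inference: a new observation gets an approximate posterior
# density immediately, with no further simulation or sampling.
x_obs = rng.normal(0.7, 1.0, n_obs)
approx_mean, approx_var = w * x_obs.mean() + b, v

# Exact posterior for this conjugate model, for comparison:
# N(n * s_obs / (n + 1), 1 / (n + 1)).
exact_mean = n_obs * x_obs.mean() / (n_obs + 1)
```

The neural network in the real method replaces the linear map, and the Gaussian is replaced by a mixture or a flow, but the division of labor is the same: all the expensive simulation happens up front, once.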

Admittedly, I do not yet understand the connection to SMC, which is already in PyMC.

Are there opinions, positive or negative, about this? If doing it by hand, I would first attempt the Gaussian mixture version [2], since it is easier to understand than the more recent NAF papers. It may be simpler to wrap the sbi package itself, though that introduces a dependence on a second tensor library.

SBI (or Approximate Bayesian Computation) is available. See this notebook. But if that doesn't fit your needs, let us know (and why it doesn't), and someone may be able to provide additional guidance.

Thanks for the link! I did look, but (unless I misunderstood) there's no approximate density when using ABC or SMC; one is generating samples from the posterior with some importance weights. Is this correct?

This contrasts with the techniques I am asking about, which do have an approximate density trained on many prior predictive simulations. Perhaps the training plays a role similar to sequential importance sampling, but the result is still an approximate posterior density, not a set of samples. Does the distinction make more sense? I admit to not fully understanding SMC, and I'll try it out before pushing further.

Edit, to further illustrate the difference as I see it: the result of training the approximate density is "amortized inference": one can take out-of-sample observations and immediately generate the corresponding approximate density without the sampling step.
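To make the contrast concrete, here is a toy rejection-ABC loop (my own sketch, not PyMC's implementation) on the same kind of conjugate Gaussian model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: theta ~ N(0, 1), x_i ~ N(theta, 1), summary = sample mean.
n_obs = 20
x_obs = rng.normal(0.7, 1.0, n_obs)

# Rejection ABC: draw from the prior, simulate data, keep draws whose
# simulated summary lands within eps of the observed summary.
n_sims, eps = 200_000, 0.05
theta = rng.normal(0.0, 1.0, n_sims)
s_sim = rng.normal(theta[:, None], 1.0, (n_sims, n_obs)).mean(axis=1)
accepted = theta[np.abs(s_sim - x_obs.mean()) < eps]

# The result is a bag of posterior samples, not a density object: there
# is nothing to evaluate at a new observation, and a different x_obs
# means re-running the whole simulation loop from scratch.
print(accepted.size, accepted.mean())
```

The accepted draws do approximate the posterior, but nothing is amortized: the simulation cost is paid again for every new observation, which is exactly the distinction I am trying to draw.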

Correct. In recent years there have been many techniques for trying to make ABC faster, and many of them end up replacing pieces of the more traditional ABC algorithms. Some try to learn useful summary statistics during inference. Others try to learn a surrogate likelihood function that is easier to compute than your native "simulator" likelihood. These techniques are not currently implemented in PyMC, but would likely be of great interest to those with complex or intractable likelihoods.

Ok, thanks for the clarification. I'll try to translate the Gaussian mixture parameterization from the "fast ε-free inference" paper to PyMC with a MAP estimate and see what happens.
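For reference, the loss I plan to minimize is the negative log-likelihood of a K-component conditional Gaussian mixture whose parameters come from the network. A minimal NumPy sketch (my own naming and shape conventions, not the paper's code):

```python
import numpy as np

def mixture_nll(theta, logits, means, log_stds):
    """Negative log-likelihood of theta under a conditional Gaussian
    mixture. logits, means, log_stds have shape (batch, K) and would be
    the network's outputs at each simulation's summary statistic;
    theta has shape (batch,)."""
    # Normalize the mixture weights in log space for numerical stability.
    log_w = logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)
    # Per-component Gaussian log-densities of each theta.
    log_comp = (
        -0.5 * ((theta[:, None] - means) / np.exp(log_stds)) ** 2
        - log_stds
        - 0.5 * np.log(2.0 * np.pi)
    )
    # Log-sum-exp over components, averaged over the batch.
    return -np.logaddexp.reduce(log_w + log_comp, axis=1).mean()

# Sanity check: one standard-normal component evaluated at theta = 0
# should give 0.5 * log(2 * pi).
nll = mixture_nll(np.zeros(4), np.zeros((4, 1)),
                  np.zeros((4, 1)), np.zeros((4, 1)))
```

In the PyMC translation I would presumably express this mixture density directly and let the MAP machinery do the optimization, rather than hand-rolling the loss as here.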

Feel free to ask here or on GitHub if you have questions. I know that @aloctavodia and @ricardoV94 were heavily involved in the current pm.Simulator implementation.

Just to supplement my previous comment: this and this are a couple of recent innovations that caught my eye. Neither uses a neural network per se, but they use analogues that are a more natural fit for a Bayesian approach (e.g., Gaussian processes). This and this try to automatically generate summary statistics. But many people are working on this problem, so the variety of approaches is quite broad.