Hello,

I wrote up my notes and notebook on the practical use of ADVI as a blog post that might be useful, primarily for beginners:

The GitHub repo with the notebook is here:

Alex

Hi, nice read. Some nitpicks:

that do not rely heavily on computationally expensive random sampling

This needs to be rephrased. It is more about the exact full-data gradient versus a stochastic one.

Simulations using full-rank ADVI might be informative as well (when the posterior resembles a Gaussian with correlations).

Thanks for taking the time to read it - greatly appreciated.

I don’t understand - you’ll have to be more explicit, please.

The process of ‘fitting’ a model is more computationally expensive for MCMC (random sampling) than for ADVI (mean-field or full-rank). I realise that for ADVI one still has to sample from the fitted approximation before anything can be inferred - is this what you’re getting at?

Sorry for the late answer (notification problems on my side). The posterior distribution seemed to be correlated, so fitting a richer variational distribution, e.g. FullRankADVI, might be beneficial. It would show the reader that we can have both: efficient fitting and a good approximation.
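To make the point above concrete, here is a small sketch (my own illustration, not from the blog post) of why a mean-field fit can mislead on a correlated posterior. For a 2D Gaussian posterior with unit marginal variances and correlation rho, the reverse-KL-optimal factorized Gaussian matches the diagonal of the precision matrix, so its per-dimension variance comes out as 1 - rho**2, underestimating the true marginal variance of 1; a full-rank Gaussian recovers the exact posterior in this case.

```python
def meanfield_variance(rho: float) -> float:
    """Optimal mean-field variance for a 2D Gaussian with correlation rho.

    The precision matrix of cov = [[1, rho], [rho, 1]] has diagonal
    entries 1 / (1 - rho**2); minimizing reverse KL with a factorized
    Gaussian matches that diagonal, so the fitted variance per
    dimension is its inverse, 1 - rho**2.
    """
    return 1.0 - rho ** 2

# The stronger the correlation, the worse mean-field understates
# the marginal uncertainty (true marginal variance is 1.0 throughout).
for rho in (0.0, 0.5, 0.9):
    print(f"rho={rho}: true marginal var = 1.0, "
          f"mean-field var = {meanfield_variance(rho):.2f}")
```

This is exactly the situation where switching from `pm.ADVI` to `pm.FullRankADVI` (or `pm.fit(method="fullrank_advi")` in PyMC) pays off: the full-rank Gaussian can represent the correlation and therefore the correct marginals.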