Variational fit (ADVI) - initialisation

@bob-carpenter thanks for highlighting those problems. I am also looking into using a deterministic variant of ADVI (cf. ENH: implement deterministic ADVI (DADVI) · Issue #7374 · pymc-devs/pymc · GitHub) to hopefully mitigate some of the problems of stochastic gradient descent.

DADVI still uses a mean-field approach, and until I get around to implementing it I will need to use the methods that are already available.

I am looking at a sequence of hierarchical models with many groups (thousands) and hundreds to thousands of samples per group. At the moment HMC sampling is too slow, so I need to use VI to speed up inference.

In addition, the models in the sequence feature an increasing number of groups, but unfortunately there doesn’t seem to be a computationally feasible way to use posterior samples from a previous model to speed up the HMC inference for the next model in the chain. With VI I can use the fitted approximation for the previous model as the initial guess for the next model, e.g. by transferring the parameters of the mean-field approximation. Does that make sense?
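To make the parameter transfer concrete, here is a minimal NumPy sketch of what I have in mind. It is library-agnostic and does not use the PyMC API. The flat layout (shared parameters first, then one parameter per group), the function name `transfer_mean_field`, and the choice of defaults for the new groups are all assumptions made for illustration. `rho` stands for the unconstrained scale parameter of the mean-field Gaussian, e.g. under a softplus parameterisation.

```python
import numpy as np


def transfer_mean_field(prev_mu, prev_rho, n_shared, n_prev_groups, n_new_groups):
    """Build initial (mu, rho) for a larger model from a fitted smaller one.

    Assumed (hypothetical) flat layout of both vectors:
    [shared parameters..., one parameter per group...].
    """
    prev_mu = np.asarray(prev_mu, dtype=float)
    prev_rho = np.asarray(prev_rho, dtype=float)
    if prev_mu.shape != (n_shared + n_prev_groups,) or prev_rho.shape != prev_mu.shape:
        raise ValueError("prev_mu/prev_rho do not match the assumed layout")
    if n_new_groups < n_prev_groups:
        raise ValueError("the next model must have at least as many groups")

    shared_mu, group_mu = prev_mu[:n_shared], prev_mu[n_shared:]
    shared_rho, group_rho = prev_rho[:n_shared], prev_rho[n_shared:]

    # Groups that are new in the next model have no fitted values yet.
    # Illustrative default: start them at the average fitted group mean,
    # with the widest fitted scale so they are not over-confident.
    n_extra = n_new_groups - n_prev_groups
    extra_mu = np.full(n_extra, group_mu.mean())
    extra_rho = np.full(n_extra, group_rho.max())

    new_mu = np.concatenate([shared_mu, group_mu, extra_mu])
    new_rho = np.concatenate([shared_rho, group_rho, extra_rho])
    return new_mu, new_rho


# Toy example: 1 shared parameter, growing from 3 to 5 groups.
new_mu, new_rho = transfer_mean_field(
    prev_mu=[0.5, 1.0, 2.0, 3.0],
    prev_rho=[-1.0, -2.0, -3.0, -2.5],
    n_shared=1,
    n_prev_groups=3,
    n_new_groups=5,
)
```

The resulting vectors would then be written into the variational parameters of the next model's approximation before fitting. How exactly that is done depends on the library and on how the model's variables are ordered in the flattened parameter vector, so the index bookkeeping above would need to be adapted.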