Since the purpose of variational inference is to find an analytical approximation to some target distribution:
- Once the ADVI fit has been performed via “.fit(…)”, is it possible to use find_MAP(…) on this variational model?
…
I know that if this were classical variational inference with a Gaussian mean-field assumption, it would be possible to take the mean of the variational distribution, which for a Gaussian is also its mode, and treat that as the MAP of the approximation. This leads me to:
In ADVI we are optimising in a transformed space, and the authors of ADVI then say: “these implicitly induce non-Gaussian variational distributions in the original latent variable space”, so the final variational model need not look Gaussian anymore. How different can it look in practice? Could the full-rank Gaussian ADVI model, for example, develop two small peaks in the final converged answer (that is, become slightly bimodal)?
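To make the question concrete, here is a minimal sketch (not PyMC code) of what "implicitly induced" means for a single variable. It assumes, purely for illustration, that a positive variable is handled with a log transform and a variable on (0, 1) with a logit transform; the transforms a given model actually uses depend on its priors. The helper names and the `mu`/`sigma` values are hypothetical. The Gaussian lives in the unconstrained space and is always unimodal there; the code only applies the change-of-variables formula to see what it looks like in the original space.

```python
import math


def normal_pdf(z, mu, sigma):
    """Density of the Gaussian variational factor in the unconstrained space."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def induced_density_positive(x, mu, sigma):
    """Density on x > 0 when log(x) is Gaussian (assumed log transform)."""
    # Change of variables: p(x) = N(log x; mu, sigma) * |d log(x) / dx|
    return normal_pdf(math.log(x), mu, sigma) / x


def induced_density_unit_interval(x, mu, sigma):
    """Density on 0 < x < 1 when logit(x) is Gaussian (assumed logit transform)."""
    # Change of variables: p(x) = N(logit x; mu, sigma) * |d logit(x) / dx|
    z = math.log(x / (1.0 - x))
    return normal_pdf(z, mu, sigma) / (x * (1.0 - x))


def count_interior_modes(density, grid):
    """Count strict local maxima of the density evaluated on a grid."""
    values = [density(x) for x in grid]
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


# Log transform: the mode in the original space is not exp(mean).
positive_grid = [i / 1000.0 for i in range(1, 5001)]
lognormal_mode = max(positive_grid, key=lambda x: induced_density_positive(x, 0.0, 1.0))

# Logit transform: same Gaussian family, two different widths.
unit_grid = [i / 2000.0 for i in range(1, 2000)]
narrow_modes = count_interior_modes(
    lambda x: induced_density_unit_interval(x, 0.0, 0.5), unit_grid
)
wide_modes = count_interior_modes(
    lambda x: induced_density_unit_interval(x, 0.0, 2.0), unit_grid
)

print("mode under log transform (mu=0, sigma=1):", lognormal_mode)
print("exp(mu) for comparison:", math.exp(0.0))
print("modes under logit transform, sigma=0.5:", narrow_modes)
print("modes under logit transform, sigma=2.0:", wide_modes)
```

Under these assumptions, the mean of the Gaussian mapped back through the transform is generally not the mode in the original space (for the log case the mode sits at exp(mu - sigma^2)). For the logit case with mu = 0, the induced density has a local minimum at 0.5 once sigma exceeds sqrt(2), so a wide enough Gaussian in the transformed space can look bimodal in the original space even though it is unimodal where the optimisation happens.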