Thanks for the response @ckrapu! I’m chiming in as the author of the original work – @dlokhl and I are working on this together.
Here’s the paper introducing the idea (PDF). You got the gist right – we construct a density over $\phi$ and sample from it using whatever off-the-shelf MCMC method you want. The final approximation is a mixture of the sampled $q$’s. The density over the variational parameters is $\log q(\phi) = \frac{1}{2}\log |\mathcal{F}(\phi)| - \lambda\, \text{KL}(q \| p)$, where $\lambda$ is a hyperparameter and $|\mathcal{F}(\phi)|$ is the determinant of the Fisher information matrix.
I think the way to implement this in PyMC is to define a new random variable $\phi$ whose logp function returns the $\log q(\phi)$ above. Would `pm.Potential` be the appropriate structure to use here? Any guidance is appreciated – we are new to PyMC.