Adding a new operator for an objective which involves gradients

I agree that normalising flows can be difficult to work with.

From preliminary tests a little while back, this approximation appears to be mode-seeking as well.

Sure, the most relevant references from my perspective are:
Huggins, Jonathan H., et al. “Practical bounds on the error of Bayesian posterior approximations: A nonasymptotic approach.” arXiv preprint arXiv:1809.09505 (2018).
and,
Hyvärinen, Aapo. “Estimation of non-normalized statistical models by score matching.” Journal of Machine Learning Research 6.Apr (2005): 695-709.

We have not published anything on using this for automated variational inference.