Improving NUTS: Experimental low-rank mass matrix adaptation

I wrote a small package with a new algorithm for mass matrix adaptation:


This should improve sampling performance on models with high correlations, even when the number of parameters is relatively large. There might be bugs and bad numerics, but I would welcome anyone brave enough to try it out and see if it improves sampling for their models. Feedback welcome :slight_smile:

3 Likes

Very nice.

In previous work (not in MCMC, just in approximating distributions) I’ve used the MultivariateNormalDiagPlusLowRank (hey, I didn’t name it…) from tensorflow-probability, which (cf. here) uses a matrix perturbation like

diag(sig) + V V^T for an (n, k) matrix V with k << n.

With that form V is not unique, but there’s no longer the cost of ensuring orthogonal columns. For mass matrix adaptation, is an orthogonal system UDU^T strictly required?
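For concreteness, here is a rough NumPy sketch of that perturbation (my own illustration, not TFP's exact parameterization): a draw with covariance diag(sig) + V V^T only needs n + k standard normals, and nothing in it requires the columns of V to be orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 2000, 5

sig = rng.uniform(0.5, 2.0, size=n)      # diagonal part
V = rng.normal(size=(n, k))              # low-rank factor, columns not orthogonal

# One draw with covariance Sigma = diag(sig) + V V^T:
z = rng.normal(size=n)
w = rng.normal(size=k)
x = np.sqrt(sig) * z + V @ w             # cov(x) = diag(sig) + V V^T
```

Getting a draw whose covariance is the *inverse* of this matrix is the part that is less obvious with this parameterization.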

That won’t work, unfortunately. We also need to draw MVN samples with the inverse of the mass matrix, and that seems tricky with that form. If you write it as D(VSV^T + I - VV^T)D, that gets easy, because the inverse has the same structure: you can just invert D and S.
I don’t think it is that costly to make sure the columns are orthogonal. Finding the eigenvalues is pretty fast, even for large systems, and we only need to do it a couple of times.
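To illustrate why that structure is convenient (a rough sketch of the idea, not the package's actual code; it assumes V has orthonormal columns and D, S are diagonal): with M = D (V S V^T + I - V V^T) D, the inverse is M^{-1} = D^{-1} (V S^{-1} V^T + I - V V^T) D^{-1}, so matrix-vector products and draws with either M or M^{-1} as covariance all cost O(nk).

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 2000, 5

d = rng.uniform(0.5, 2.0, size=n)              # diagonal of D
s = rng.uniform(0.1, 10.0, size=k)             # diagonal of S (rescaled eigenvalues)
V, _ = np.linalg.qr(rng.normal(size=(n, k)))   # orthonormal columns, V^T V = I_k

def mass_matvec(x):
    """y = M x with M = D (V S V^T + I - V V^T) D."""
    y = d * x
    y = y + V @ ((s - 1.0) * (V.T @ y))
    return d * y

def inv_mass_matvec(x):
    """y = M^{-1} x: same structure, just with 1/d and 1/s."""
    y = x / d
    y = y + V @ ((1.0 / s - 1.0) * (V.T @ y))
    return y / d

def draw_with_M():
    """x ~ N(0, M) via the factor A = D (V S^{1/2} V^T + I - V V^T), so A A^T = M."""
    z = rng.normal(size=n)
    return d * (z + V @ ((np.sqrt(s) - 1.0) * (V.T @ z)))

def draw_with_M_inverse():
    """x ~ N(0, M^{-1}) via B = D^{-1} (V S^{-1/2} V^T + I - V V^T), so B B^T = M^{-1}."""
    z = rng.normal(size=n)
    return (z + V @ ((1.0 / np.sqrt(s) - 1.0) * (V.T @ z))) / d
```

The orthogonality of V is what makes the cross terms like V S V^T (I - V V^T) vanish when you multiply the pieces out, which is why the non-orthogonal diag(sig) + V V^T form doesn't invert as neatly.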

1 Like

We should request a MultivariateNormalSingularValueDecomposed distribution from tensorflow_probability :wink:

1 Like

Maybe more like MultivariateNormalScaledOnlySomeOfTheEigenvaluesDecomposed :slight_smile:

3 Likes