How to interpret the outcome of adapt_full

julien · July 12, 2023, 6:28pm

Sometimes models work better with init=adapt_full sometimes adapt_diag is just better, and I don’t quite understand in which situations adapt_full is supposed to work better or what does it tell you when it does. I would also like to understand if you can somehow use some of the information of adapt_full, e.g. extract the mass matrix, and obtain from that some intuition on how to reparametrise your model to get adapt_diag to work better.

In the context of Quantum Field Theory, which is what the HMC was created for, I have always used the identity as a mass matrix. And this works. A dense mass matrix would anyway not be possible, even for a small space-time it would not even fit in the memory of any of our supercomputers. And this works despite our models always being very badly parametrised, in the sense that a large part of the game consists precisely in finding the actual degrees of freedom (protons and so on) emerging from the gluon&quark parametrisation, and that’s the whole reason why the HMC exists (otherwise a Taylor expansion would be enough).
Is this mass matrix tuning related to the No U-turn extra layer?

By the way, I have never understood if and how I could use pymc.HamiltonianMC to construct a good-old HMC with a fixed step size without all the fancy auto-tuning (even with a single-level integrator)

Concrety, my current problem is:
I have a model which I started to write with an increasing level of complexity. Basically the top layers of the graph are shared, and then the difference is whether at the end I have:

A normal likelihood
A multivariate normal likelihood
A VAR with multivariate normal innovations

And I am stucked after step 1. This samples in 10 minutes with adapt_full, lasts for days without end with anything adapt_diag based, and somewhat samples in 10 minutes with JAX/Numpyro but not as good as PyMC/adapt_full. I could be fine with that, but the problem is that I cannot use adapt_full for the second model: because of my covariance matrix it seems to be too many variables and then the mass matrix can just not fit in memory. So somehow I need adapt_diag to work on model 1.
My data is approximately 200(components)x200(samples), with the covariance acting in the first dimension so a 200x200 LKJ or WB prior. My data also has plenty of orders of magnitude (mostly decaying exponentially in the first dimension), which might not be ideal in terms of stability.

Topic		Replies	Views
QuadPotentialFullAdapt v5	7	424	July 2, 2022
Adaptation phases of PyMC3 HMC NUTS sampler Questions	5	1071	February 8, 2021
`jitter+adapt_diag` vs `adapt_diag` Questions	3	598	October 7, 2020
Hamiltonian MC with different variable dimensions Questions	3	858	July 23, 2019
What exactly is `jitter+adapt_diag` and why is it the default now? Questions	3	3433	November 3, 2017

How to interpret the outcome of adapt_full

Related topics