# of ADVI samples before HMC; Sampling with correlated latent vars

DanWeitzenfeld · November 14, 2017, 10:11pm

I’m working on a time series model with many parameters. When I call pm.sample() with the default args, ADVI “converges” after about 20k steps, and then HMC runs at about 1.4 iterations/second. It looks like the callbacks used for the ADVI initialization are:

    cb = [
        pm.callbacks.CheckParametersConvergence(
            tolerance=1e-2, diff='absolute'),
        pm.callbacks.CheckParametersConvergence(
            tolerance=1e-2, diff='relative'),
    ]

When I instead use pm.ADVI() with a callback of pm.callbacks.CheckParametersConvergence(diff='absolute', tolerance=1e-6), ADVI runs for upwards of 400k steps. The loss improves slowly after 100k, but it still improves.
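
To make the effect of the tolerance concrete, here is a minimal pure-Python sketch of a parameter-convergence check. It is a simplified stand-in, not PyMC's implementation: the function names, the check interval `every=100`, the max-over-parameters stopping rule, and the exponentially converging toy trajectory are all assumptions made for illustration.

```python
import math


def first_stop(params_at, tolerance, diff="absolute", every=100,
               max_iter=500_000, eps=1e-8):
    """Return the first checked iteration at which the largest parameter
    change since the previous check drops below `tolerance`, else None."""
    prev = None
    for i in range(0, max_iter + 1, every):
        current = params_at(i)
        if prev is not None:
            if diff == "absolute":
                deltas = [abs(c - p) for c, p in zip(current, prev)]
            else:
                # Relative change, with eps guarding against division by zero.
                deltas = [(abs(c - p) + eps) / (abs(p) + eps)
                          for c, p in zip(current, prev)]
            if max(deltas) < tolerance:
                return i
        prev = current
    return None


def toy_params(i, tau=20_000.0):
    """Hypothetical trajectory: two parameters that approach 1.0 and -2.0
    exponentially slowly, mimicking a loss that keeps improving."""
    decay = math.exp(-i / tau)
    return [1.0 - decay, -2.0 * (1.0 - decay)]


loose = first_stop(toy_params, tolerance=1e-2)
tight = first_stop(toy_params, tolerance=1e-6)
print("stops with tolerance=1e-2 at iteration:", loose)
print("stops with tolerance=1e-6 at iteration:", tight)
```

On this toy trajectory the loose tolerance fires almost immediately, while the tight one keeps going for far longer, even though the parameters are still drifting in both cases.
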
Three questions:

1. Is it a bad idea to let ADVI run for longer before switching to HMC? In other words, is there a good reason that the callback in init_nuts is what it is?
2. If the answer to 1. is ‘No, you should try running ADVI for longer to see if it speeds up your HMC’, is there an easier way to do this than what I’m planning to do, which is to mimic the code in this block?
3. I suspect HMC is slow because some of my latent variables are correlated. In my model, there are two possible ‘causes’ for some data points: think of a state space model where a point could be a jump in state vs. an additive outlier. Are there best practices/recommended reading for this situation?

Thanks in advance.

junpenglao · November 15, 2017, 6:17am

The number of iterations in the default setting is enough for most of our test cases. If you want to increase the number of iterations, you should mimic the code block from pm.sampling; currently there is no easy way to change it.

In general, ADVI initialization is only good to some extent: the problem is that it underestimates the variance, which results in the subsequent NUTS run using a small step size. So it is hard to say for sure whether running ADVI longer would improve NUTS.
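
As a rough illustration of the variance underestimation, consider a textbook case rather than the model in this thread: a bivariate Gaussian target with equal marginal variances and correlation rho. The optimal fully factorized Gaussian under KL(q||p) has factor variances equal to the reciprocals of the diagonal of the target's precision matrix. The sketch below computes that for a few correlations; the function name and the 2x2 setup are assumptions chosen for the example.

```python
def mean_field_variance(rho, sigma2=1.0):
    """Per-factor variance of the optimal mean-field Gaussian fit to a
    bivariate Gaussian with marginal variance sigma2 and correlation rho."""
    # Covariance matrix [[a, b], [b, a]] of the target distribution.
    a = sigma2
    b = rho * sigma2
    det = a * a - b * b
    # Diagonal entry of the precision (inverse covariance) matrix.
    precision_diag = a / det
    # Mean-field factor variance is the reciprocal of that entry.
    return 1.0 / precision_diag


for rho in (0.0, 0.5, 0.9, 0.99):
    mf = mean_field_variance(rho)
    print(f"rho={rho:4.2f}  true marginal var=1.00  mean-field var={mf:.4f}")
```

The mean-field variance works out to sigma2 * (1 - rho**2), so the more strongly the latent variables are correlated, the more the fitted variances shrink relative to the true marginals.
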

If you have correlated latent variables, you should try to reparameterize the model. Maybe you can try modelling the additive outlier cases as a state as well?

DanWeitzenfeld · November 17, 2017, 2:54pm

Thanks for the reply!
Regarding re-parameterization: as far as I can tell from reading the literature on this problem (time series with both structural breaks and additive outliers), there’s no way around the correlated variables, and the solution is specialized sampling algorithms (e.g. this paper).
