What is tune in sampler?

Rahul_Deora · May 13, 2019, 5:53pm

What is tune in pm.sample(1000, tune=1000)?
I understand 1000 samples are taken from the prior to estimate the posterior, but what does tune do? Changing it to a low number messes everything up.

chartl · May 13, 2019, 8:44pm

Hi @Rahul_Deora,

Take a quick peek at @colcarroll’s series of posts on implementing Hamiltonian Monte Carlo. Post #3 covers this in detail:

One of the most immediate improvements you can make to Hamiltonian Monte Carlo (HMC) is to implement step size adaptation, which gives you fewer parameters to tune, and adds in the concept of “warmup” or “tuning” for your sampler.

Rahul_Deora · May 14, 2019, 4:39am

Can you explain briefly what that means? I’m not quite getting it. I am working through Statistical Rethinking, and the pymc devs have used MCMC in place of quadratic approximation. The book explains MCMC much later.

colcarroll · May 14, 2019, 3:23pm

Yes! A design goal of PyMC3 is to let the user worry about statistical modelling, and not worry about inference, and tuning attempts to automatically set some of the dozens of knobs available in modern MCMC methods.

As a basic, concrete example, Metropolis-Hastings MCMC starts at a point x, then draws x' from Normal(x, sd), and does some math to accept or reject x': if it rejects x', you add x to your samples again.
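To make the accept/reject step concrete, here is a minimal sketch of random-walk Metropolis in plain Python. The target (a standard normal) and the names `log_target` and `metropolis` are assumptions chosen for illustration. This is not PyMC3's implementation.

```python
import math
import random


def log_target(x):
    # Log density of a standard normal, up to an additive constant.
    return -0.5 * x * x


def metropolis(n_draws, sd, x0=0.0, seed=0):
    """Random-walk Metropolis with a fixed Normal(x, sd) proposal."""
    rng = random.Random(seed)
    x = x0
    samples = []
    n_accepted = 0
    for _ in range(n_draws):
        proposal = rng.gauss(x, sd)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
            n_accepted += 1
        # On rejection, x is unchanged, so the old point is recorded again.
        samples.append(x)
    return samples, n_accepted / n_draws


# Smaller proposal scales are accepted more often but move less per step.
for sd in (0.1, 1.0, 10.0):
    _, rate = metropolis(2000, sd)
    print(f"sd={sd}: acceptance rate {rate:.2f}")
```
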

So how do we choose sd for the proposal distribution? There are some papers that suggest Metropolis-Hastings is most efficient when you accept 23.4% of proposed samples, and it turns out that lowering the step size (here, sd) increases the probability of accepting a proposal. PyMC3 will spend the first tune steps (500 by default) increasing and decreasing the step size to try to find the value of sd that gives you an acceptance rate of 23.4% (you can even set different acceptance rates).

The problem is that if you change the step size while sampling, you lose the guarantees that your samples (asymptotically) come from the target distribution, so you should typically discard these. Also, there is typically a lot more adaptation going on in those first steps than just step_size.
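A rough sketch of that idea: adapt sd only during the first n_tune steps, then freeze it and keep only the draws that follow. The update rule here (scale sd up or down by a fixed factor after each window of steps) is a simplified assumption for illustration, not the rule PyMC3 actually uses. The names `tuned_metropolis`, `window`, and the standard-normal target are likewise hypothetical.

```python
import math
import random


def log_target(x):
    # Log density of a standard normal, up to an additive constant.
    return -0.5 * x * x


def tuned_metropolis(n_draws, n_tune, sd=1.0, target_accept=0.234,
                     window=50, seed=0):
    """Random-walk Metropolis that adapts sd during tuning only."""
    rng = random.Random(seed)
    x = 0.0
    kept = []
    accepted_in_window = 0
    for step in range(n_tune + n_draws):
        proposal = rng.gauss(x, sd)
        log_ratio = log_target(proposal) - log_target(x)
        accepted = rng.random() < math.exp(min(0.0, log_ratio))
        if accepted:
            x = proposal
        if step < n_tune:
            # Tuning phase: track acceptance and nudge sd toward the target.
            accepted_in_window += accepted
            if (step + 1) % window == 0:
                rate = accepted_in_window / window
                # Too many acceptances: take bigger steps. Too few: smaller.
                sd *= 1.25 if rate > target_accept else 0.8
                accepted_in_window = 0
        else:
            # Sampling phase: sd is frozen, and only these draws are kept.
            kept.append(x)
    return kept, sd


draws, final_sd = tuned_metropolis(n_draws=1000, n_tune=1000, sd=0.01)
print(len(draws), final_sd)
```

Note that n_draws and n_tune are independent arguments here: the tuning draws are thrown away, so the number of kept samples does not depend on how long tuning runs.
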

tl;dr: The first tune steps allow the PyMC3 developers to adjust parameters based on best practices and current research.

Rahul_Deora · May 15, 2019, 11:21am

So should tune equal the number of samples?

colcarroll · May 16, 2019, 1:42pm

Nope! They are two parameters, set separately: pm.sample(n_samples, tune=n_tune).

I think the default of 500 samples and 500 tuning samples is usually good, but more tuning can sometimes help for complicated geometries, and more samples can sometimes help if you are making careful estimates.

PrashantSaikia · September 16, 2022, 8:00pm

Kinda hard to do when it takes forever. My n_samples=10 and n_tune=3 took about an hour and a half to run, on a small dataset of ~3000 rows.

So, what would you suggest here? Or is it just me who is getting these insanely long run times (I’m on a MacBook Air M1)?

colcarroll · September 17, 2022, 12:15am

Hi! The short answer is that it may be difficult to sample.

There are certain posteriors that might frustrate gradient-based samplers. On each iteration, NUTS takes up to 1,024 steps, checking for a U-turn as it goes, and stops expanding the trajectory as soon as it encounters one. If sampling is taking this long, it may be taking all 1,024 steps on every iteration. This means:

the sampler is not actually encountering a U-turn, meaning the draws will be more correlated than they could be, and
it takes a long time (1,024 log_prob evaluations, and 1,024 grad(log_prob) evaluations, more or less).
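As a back-of-the-envelope check, the run described above (10 draws plus 3 tuning steps in about an hour and a half) can be turned into a rough cost per gradient evaluation. This sketch assumes every iteration hit the 1,024-step cap (2**10, i.e. a maximum tree depth of 10) and that a single chain accounts for the wall time. Both are assumptions, and the function name is hypothetical.

```python
def worst_case_gradient_evals(n_iterations, max_tree_depth=10):
    # A trajectory doubled max_tree_depth times has 2**max_tree_depth steps,
    # and each step needs roughly one gradient evaluation.
    steps_per_iteration = 2 ** max_tree_depth
    return n_iterations * steps_per_iteration


total = worst_case_gradient_evals(10 + 3)   # 10 draws + 3 tuning steps
seconds_per_eval = 1.5 * 3600 / total       # about 1.5 hours of wall time
print(total, round(seconds_per_eval, 2))    # 13312 0.41
```

Under those assumptions, each gradient evaluation would cost roughly 0.4 seconds.
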

Sampling typically will get faster as tuning goes along (I think currently in pymc, a new mass matrix is used after 101 tuning draws), but if that takes prohibitively long, you might have to think about

an alternative strategy to summarize the posterior (optimization, VI, pen and paper),
changing some priors to more well behaved distributions that are reasonably informative (i.e., make everything normal or half-normal with scales that are like 10, instead of like 1e10),
changing the model structure to better capture how the data were generated

Sorry this isn’t an easy answer!
