I noticed that metropolis.py contains this code:
# Switch statement
if acc_rate < 0.001:
# reduce by 90 percent
scale *= 0.1
elif acc_rate < 0.05:
# reduce by 50 percent
scale *= 0.5
elif acc_rate < 0.2:
# reduce by ten percent
scale *= 0.9
elif acc_rate > 0.95:
# increase by factor of ten
scale *= 10.0
elif acc_rate > 0.75:
# increase by double
scale *= 2.0
elif acc_rate > 0.5:
# increase by ten percent
scale *= 1.1
Where do these numbers come from (the acceptance rate thresholds and the corresponding scaling factors)? I don’t think they are part of the standard Metropolis algorithm, and I could not find any references to them. Is it some kind of empirical rule?
Am I correct in understanding that if I override the default with
pm.Metropolis(scaling=10.0), for instance, this new scaling value is only taken into account at the beginning, and then during the run it gets changed as the acceptance rate varies?
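For anyone following along, here is a minimal toy sketch of how these rules act inside a random-walk Metropolis loop. This is my own reconstruction, not the actual PyMC implementation; the tune_interval of 100 and the burn-in length are assumptions:

```python
import math
import random

def tune(scale, acc_rate):
    """Adjust the proposal scale using the thresholds quoted above."""
    if acc_rate < 0.001:
        scale *= 0.1    # reduce by 90 percent
    elif acc_rate < 0.05:
        scale *= 0.5    # reduce by 50 percent
    elif acc_rate < 0.2:
        scale *= 0.9    # reduce by 10 percent
    elif acc_rate > 0.95:
        scale *= 10.0   # increase by a factor of ten
    elif acc_rate > 0.75:
        scale *= 2.0    # double
    elif acc_rate > 0.5:
        scale *= 1.1    # increase by 10 percent
    return scale        # acceptance in [0.2, 0.5]: leave unchanged

def metropolis(logp, x0, draws=2000, tune_steps=500, tune_interval=100, scale=1.0):
    """Random-walk Metropolis; adapts `scale` only during the first tune_steps."""
    x = x0
    accepted = 0
    samples = []
    for i in range(tune_steps + draws):
        # Adapt only during burn-in, every tune_interval iterations.
        if 0 < i < tune_steps and i % tune_interval == 0:
            scale = tune(scale, accepted / tune_interval)
            accepted = 0
        prop = x + random.gauss(0.0, scale)
        if math.log(random.random()) < logp(prop) - logp(x):
            x = prop
            accepted += 1
        if i >= tune_steps:
            samples.append(x)
    return samples, scale
```

For example, sampling a standard normal with `metropolis(lambda x: -0.5 * x * x, 0.0)` shows the scale settling to a value that keeps the acceptance rate in the untouched [0.2, 0.5] band.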
Good question. I guess this works well empirically, but as for where it comes from, we will need to ask @fonnesbeck.
They are arbitrary. You can choose coarser or finer adaptation, but these work pretty well. I suppose we could allow for different mappings to be passed. Since the adaptation only happens in the tuning phase, it does not affect the validity (i.e. reversibility) of the Markov chain.
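The "different mappings" idea could look something like the following. This is a hypothetical sketch, not an existing PyMC API: the if/elif chain is expressed as data so a user could supply their own thresholds and factors:

```python
# Hypothetical generalization (not an existing PyMC API): the switch
# statement as an ordered list of (condition, multiplier) rules.
DEFAULT_RULES = [
    (lambda r: r < 0.001, 0.1),   # reduce by 90 percent
    (lambda r: r < 0.05, 0.5),    # reduce by 50 percent
    (lambda r: r < 0.2, 0.9),     # reduce by 10 percent
    (lambda r: r > 0.95, 10.0),   # increase by a factor of ten
    (lambda r: r > 0.75, 2.0),    # double
    (lambda r: r > 0.5, 1.1),     # increase by 10 percent
]

def tune_with_rules(scale, acc_rate, rules=DEFAULT_RULES):
    """Apply the first matching rule; order mirrors the if/elif chain above."""
    for cond, factor in rules:
        if cond(acc_rate):
            return scale * factor
    return scale  # acceptance in [0.2, 0.5]: leave scale unchanged
```

A coarser or finer adaptation would then just be a different rules list.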
Thank you for the answer.
When you say it only happens during the tuning phase, do you mean the ‘burn-in’ phase (length = 500 by default)?
After that, the algorithm does not go through this part of the code anymore?
Also, are these arbitrary numbers based on a paper?
Correct, only during burn-in.
Thanks, but then does the scaling value from the last step of the burn-in remain in use, or does it revert to the default of 1 (or to whatever scaling value was passed in)? And does the scaling value passed to
pm.sample(Metropolis) get used immediately at the beginning of the burn-in, or not?