Why does Metropolis evaluate the model twice per step?

Hi,

I was wondering why the Metropolis implementation of pymc3 performs the model forward calculation twice per step. Running the simple example below and inspecting the output shows that “Executed” is printed twice per step.

import numpy as np
import pymc3 as pm
import theano.tensor as tt

from theano.compile.ops import as_op


@as_op(itypes=[tt.dscalar, tt.dscalar],
       otypes=[tt.dvector])
def test_dist(x, y):
    radius = (x - y)**3
    print("Executed")  # marks every forward evaluation of the model
    return np.array([radius])


with pm.Model() as test_model:

    a1 = pm.Normal("a1", mu=5, sigma=2)
    a2 = pm.Normal("a2", mu=10, sigma=2)

    a3 = test_dist(a1, a2)

    a3_obs = pm.Normal("a3_obs", mu=a3,
                       sigma=130, observed=10.0)

    step = pm.Metropolis([a1, a2], blocked=True)

    trace = pm.sample(10, tune=50, init=None, chains=1,
                      cores=1, step=step, return_inferencedata=False)
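(To make the count easy to check, one could also bump a module-level counter inside the op instead of scanning stdout; a small sketch reusing the imports above, with an arbitrary counter name:)

n_evals = 0  # incremented on every forward evaluation

@as_op(itypes=[tt.dscalar, tt.dscalar],
       otypes=[tt.dvector])
def counted_dist(x, y):
    global n_evals
    n_evals += 1
    return np.array([(x - y)**3])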

From my understanding, shouldn’t it only be computed once, and then compared to the previous result?
Or is this for some more advanced feature that goes beyond the basic Metropolis algorithm?

I also noticed that the model is evaluated once if I load an existing trace using pm.load_trace. Why is that needed?

Thanks for the help!

Hi @maxsch95,

there are some scenarios where caching could speed things up, but as soon as blocked/compound sampling (CompoundStep) is involved, caching would violate detailed balance.
In your example there’s blocked=True, so in every iteration the CompoundStep step method iterates over all blocks (two Metropolis steppers, one assigned to a1 and one to a2) and steps them separately before concluding the compound/blocked step.
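If you want to check what you actually end up with, you can instantiate the step method directly and inspect its type; a quick sketch with a throwaway model (b1/b2 are just placeholders):

import pymc3 as pm

# Throwaway model: inspect the step-method objects that pm.Metropolis
# returns for several variables, with and without blocking.
with pm.Model():
    b1 = pm.Normal("b1")
    b2 = pm.Normal("b2")
    print(type(pm.Metropolis([b1, b2], blocked=True)).__name__)
    print(type(pm.Metropolis([b1, b2], blocked=False)).__name__)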

I also noticed that the model is evaluated once if I load an existing trace using pm.load_trace. Why is that needed?

I’m not familiar with that one, and there’s a high chance that we break/remove it at some point. If you have good reasons against return_inferencedata=True, please let us know so we can try to iron out the blockers 🙂

cheers

Hi @michaelosthege,

thank you for the quick response!
However, I am still a little puzzled and would like to ask some follow-up questions for clarification.
First off, what I want is to perform Metropolis steps where all the parameters are updated at once instead of sequentially, one after another.
Could you elaborate on the detailed balance issue a little bit?
From my understanding, at each step we compute the acceptance ratio

$$A(x', x) = \min\!\left(1,\ \frac{P(x')\, g(x \mid x')}{P(x)\, g(x' \mid x)}\right)$$

where in my case x = {a1, a2} are the parameters from the previous step and x' = {a1', a2'} are the newly proposed parameters. P(x) is the product of the likelihood and the priors; we already know P(x) from the previous step and only need to compute P(x'), which requires one forward calculation. Shouldn't the ratio of the g's take care of detailed balance?
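In pseudocode, this is what I would expect a single step to look like (a sketch assuming a symmetric proposal, so the g ratio cancels, and working in log space; logp and propose are hypothetical stand-ins for the model):

import numpy as np

rng = np.random.default_rng()

# Sketch of the step I have in mind (symmetric proposal, log space).
# `logp` and `propose` are stand-ins for the model's logp and proposal.
def metropolis_step(x, logp_x, logp, propose):
    x_new = propose(x)        # draw x' ~ g(. | x)
    logp_new = logp(x_new)    # the one forward calculation per step
    if np.log(rng.uniform()) < logp_new - logp_x:
        return x_new, logp_new
    return x, logp_x          # rejected: keep the cached logp(x)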

Also, you suggested that there would be two Metropolis steppers because I have two input variables a1 and a2, but I noticed that the forward calculation is always performed exactly twice, even if I have 3, or really any arbitrary number of, input parameters. If I omit the blocked keyword, I get a separate stepper for each parameter, but each stepper again performs two forward calculations.
Say I have 3 input parameters a1, a2 and a3, and an observed value a4. I then get 3 Metropolis steppers and 6 model evaluations (“Executed” printed to stdout 6 times) per step.

I am just trying to understand what is going on under the hood. I looked at the source code but it was a little overwhelming. So, would be great if you could elaborate a little bit.
Thank you!

This line creates a callable to compute the ratio: pymc3/metropolis.py at v3.11.2 · pymc-devs/pymc3 · GitHub

It is a function that takes both x' and x: pymc3/metropolis.py at v3.11.2 · pymc-devs/pymc3 · GitHub and pymc3/metropolis.py at v3.11.2 · pymc-devs/pymc3 · GitHub

That’s why you get two evaluations in every iteration, even without blocked sampling.
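Conceptually (just a sketch, not the actual Theano code), the compiled function behaves like this:

import numpy as np

# Hypothetical stand-in for the model's log-probability; in pymc3 this
# is a compiled Theano graph over the raveled parameter vector.
def model_logp(q):
    return -0.5 * np.sum(q ** 2)

# Conceptual sketch of delta_logp: it receives BOTH the proposed point q
# and the current point q0, so the model is evaluated twice per call.
def delta_logp(q, q0):
    return model_logp(q) - model_logp(q0)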

Thanks again for the response!

Those links were really helpful. I now understand why it is computed twice. Does it need to be, though?
logp1 in delta_logp (pymc3/metropolis.py at c1efb7ad51e34e6bd8b292ef226a7e191adbdb82 · pymc-devs/pymc3 · GitHub) obviously needs to be computed, but logp0 should still be known from the previous step, right? That would mean we perform twice as many model evaluations as we actually need.

I realize this probably does not matter for most models, but my evaluation takes around 20 to 30 seconds and I want to perform as few of these calculations as possible.

As long as you step all dimensions at the same time, you can do that. One way to implement it would be a custom class that inherits from pm.Metropolis and overrides the __init__ and astep() methods (astep is the array-level method that the base class’s step() dispatches to).
You can keep track of the logp with a float or list attribute, for example; a rough sketch follows below.
But keep in mind that stepping all dimensions at the same time can be very inefficient, particularly with many dimensions and correlations.
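Here is a minimal sketch of that idea. It assumes this single stepper handles all free variables (so there are no shared replacements to keep in sync) and omits the parent class’s tuning bookkeeping, so treat it as a starting point rather than a drop-in replacement:

import numpy as np
import pymc3 as pm
import theano

from pymc3.step_methods.arraystep import metrop_select
from pymc3.theanof import floatX, join_nonshared_inputs, make_shared_replacements


class CachedLogpMetropolis(pm.Metropolis):
    """Sketch: cache the logp of the current point so each step costs a
    single model evaluation. Assumes this stepper steps ALL free
    variables; the tuning logic of the parent class is omitted."""

    def __init__(self, vars=None, model=None, **kwargs):
        model = pm.modelcontext(model)
        if vars is None:
            vars = model.vars
        super().__init__(vars=vars, model=model, **kwargs)
        # Compile a single-point logp, mirroring what delta_logp does for
        # two points. With all free variables stepped here,
        # make_shared_replacements returns an empty mapping.
        shared = make_shared_replacements(vars, model)
        [logp0], inarray0 = join_nonshared_inputs([model.logpt], vars, shared)
        self._logp = theano.function([inarray0], logp0)
        self._logp_q0 = None  # cached logp of the current point

    def astep(self, q0):
        if self._logp_q0 is None:
            self._logp_q0 = self._logp(q0)  # only on the very first step
        q = floatX(q0 + self.proposal_dist() * self.scaling)
        logp_q = self._logp(q)  # the one model evaluation per step
        accept = logp_q - self._logp_q0
        q_new, accepted = metrop_select(accept, q, q0)
        if accepted:
            self._logp_q0 = logp_q
        self.accepted += accepted
        stats = {
            "tune": self.tune,
            "scaling": self.scaling,
            "accept": np.exp(accept),
            "accepted": accepted,
        }
        return q_new, [stats]

Used as step = CachedLogpMetropolis(blocked=True) inside the model context, test_dist should then execute once per iteration (twice only on the very first step). blocked=True matters: without it you get one stepper per variable again and the all-variables assumption breaks.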


Thank you, that’s great. I am not doubting that it is less efficient. Do you maybe have a reference where I could read up on that?

Maybe as a general remark: the popularity of Bayesian methods in many areas of physics is increasing rapidly. However, evaluating the models in many of those areas takes non-negligible time and can easily become the bottleneck of the analysis. So, wherever possible, caching previous results to avoid reevaluation would be great. I don't know whether this is at all possible in general; I just wanted to leave it as a remark.
Thanks for the help!