Compound steps

I am having trouble understanding how the iterations are performed with different compound steps.

I’m running the model explained in:

It uses the NUTS sampler for the continuous variable and the CategoricalGibbsMetropolis sampler for the two discrete variables.

The only explanation I could find in the documentation is:
“sampling proceeds by first applying step1 then step2 at each iteration.” I don’t really understand how the two samplers can be used at each iteration.
There is also an open issue on GitHub pointing out that compound-step sampling is not explained in the documentation:

From what I can see when I plot step_size_bar, it looks as if it runs all the iterations using the first sampler and then runs all the iterations again using the second sampler:

n = aCH_.eval().shape[1]
with pm.Model() as basic_model:
    # Priors for unknown model parameters
    b1 = pm.Uniform('b1', lower=0.3, upper=0.5, testval=0.45)
    ncomp_aCH = pm.Categorical('ncomp_aCH', p=np.ones(n)/n)
    ncomp_aCOH = pm.Categorical('ncomp_aCOH', p=np.ones(n)/n)

    aCH = aCH_[0, ncomp_aCH]
    aCOH = aCOH_[0, ncomp_aCOH]


    out = b1*aCH + aCOH

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=out, tau=sigma, observed=Y)

    trace = pm.sample(2000000, progressbar=True)


plt.plot(trace['step_size_bar'])
plt.show()

Does anyone have any other information on how compound step sampling works?


Yeah that part is not explained very well in the doc, that’s why I opened the issue there…

In brief, when compound steps are involved, PyMC3 takes a list of steps and generates a list of methods. So for example if you do

with pm.Model() as m:
    rv1 = ...
    ...
    step1 = pm.Metropolis([rv1, rv2])
    step2 = pm.CategoricalGibbsMetropolis([rv3])
    trace = pm.sample(..., step=[step1, step2]...)

The compound step now contains a list of methods.

At each sample, it iterates over the methods: each takes a point as input and generates a new point as output. Within each step, the new point is proposed via a stochastic kernel, and if the proposal is rejected by the Metropolis-Hastings criterion, the step simply outputs the original input point.
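Conceptually, the compound step applies each method in turn, feeding the output point of one step into the next. A minimal sketch in plain Python (not PyMC3’s actual implementation; the step functions and the point dict are made-up stand-ins):

```python
# `point` is a dict mapping variable names to current values; each step
# method returns an updated point (which may equal the input point if
# the step's proposal was rejected).

def step_a(point):
    # stand-in for e.g. a Metropolis update on 'p': just bump it here
    new = dict(point)
    new['p'] = point['p'] + 0.1
    return new

def step_b(point):
    # stand-in for e.g. a BinaryGibbsMetropolis update on 'ni': flip it
    new = dict(point)
    new['ni'] = 1 - point['ni']
    return new

def compound_step(point, methods):
    # Apply each step method in turn; the output of one step
    # is the input to the next.
    for method in methods:
        point = method(point)
    return point

point = {'p': 0.0, 'ni': 0}
point = compound_step(point, [step_a, step_b])
print(point)  # {'p': 0.1, 'ni': 1}
```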

Take a simple example:

n_ = theano.shared(np.asarray([10, 15]))
with pm.Model() as m:
    p = pm.Beta('p', 1., 1.)
    ni = pm.Bernoulli('ni', .5)
    k = pm.Binomial('k', p=p, n=n_[ni], observed=4)

Now specify the step:

with m:
    step1 = pm.Metropolis([m.free_RVs[0]])
    step2 = pm.BinaryGibbsMetropolis([ni])

And now you can pass a point to the step, and see what happens:

point = m.test_point
point
# {'ni': array(0), 'p_logodds__': array(0.)}

point, state = step1.step(point=point)
point, state
# ({'ni': array(0), 'p_logodds__': array(0.69089502)},
# [{'accept': 0.8832003265520174, 'tune': True}])

As you can see, the value of ni does not change, but p_logodds__ is updated.

And similarly, you can get a sample using the step2:

# (notice that step2 has no `generates_stats`, so it only outputs the point here.)
point = step2.step(point=point)
point
# {'ni': array(0), 'p_logodds__': array(0.69089502)}

The compound step works exactly like this, iterating over all the steps in the list. In effect, it is Metropolis-Hastings within Gibbs sampling.
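To illustrate what “Metropolis-Hastings within Gibbs” means, here is a self-contained toy sampler (purely illustrative, not PyMC3 code): a continuous variable x and a binary variable z, each updated by its own accept/reject step in a fixed order, just like a compound step:

```python
import math
import random

# Toy target: x ~ Normal(0, 1) and z ~ Bernoulli(0.7), independent
# for simplicity, so the stationary distribution is easy to check.
def logp(x, z):
    return -0.5 * x * x + (math.log(0.7) if z == 1 else math.log(0.3))

def mh_step_x(x, z, scale=1.0):
    # Random-walk Metropolis update for the continuous coordinate,
    # holding z fixed (i.e. targeting the conditional logp of x).
    prop = x + random.gauss(0, scale)
    if math.log(random.random()) < logp(prop, z) - logp(x, z):
        return prop
    return x  # rejected: keep the old value

def gibbs_step_z(x, z):
    # Propose flipping z and accept/reject with the same MH criterion.
    prop = 1 - z
    if math.log(random.random()) < logp(x, prop) - logp(x, z):
        return prop
    return z

random.seed(0)
x, z = 0.0, 0
chain = []
for _ in range(5000):
    x = mh_step_x(x, z)     # step 1: update x given z
    z = gibbs_step_z(x, z)  # step 2: update z given x (same order every time)
    chain.append((x, z))

frac_z1 = sum(z for _, z in chain) / len(chain)
print(round(frac_z1, 2))  # should be near 0.7
```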


I hope this clarifies things a bit; it took me some time to understand the details as well.
Also, it would be a welcome contribution if you extended this into a doc :smile:

I can surely try. However, this is not yet clear to me, probably because until now I have focused on understanding the standard Metropolis-Hastings algorithm with a random-walk proposal and haven’t taken the time to fully understand Gibbs sampling.

I’m going to take some time to read some references on Gibbs and on NUTS, and will come back to you either if I still don’t understand or if I am confident enough in my understanding to write a doc.

Thanks for your answer!


Ok I think I got it.

So it’s basically Gibbs sampling where, in your example, a sample from the conditional distribution is generated using one iteration of Metropolis for the parameter p and one iteration of BinaryGibbsMetropolis for ni? But of course the Metropolis/BinaryGibbsMetropolis step can reject the proposed value, in which case no change occurs for that parameter.

When the steps are automatically assigned to parameters, how do we know in which order the two steps take place? Is the order random at each iteration, or fixed?

Exactly. Notice that it is not exactly Gibbs sampling, as it does not generate samples directly from the conditional distribution. More precisely, it updates in a Gibbs-like fashion, where the accept/reject decision compares the ratio of the conditional logp against the log of a draw u \sim \text{Uniform}(0, 1).
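In code, that accept/reject rule looks roughly like this (a hypothetical sketch, not PyMC3’s actual code; delta_logp stands for the difference in conditional log-probability between the proposed and current values):

```python
import math
import random

def metropolis_accept(delta_logp):
    # Accept iff log(u) < delta_logp for u ~ Uniform(0, 1),
    # i.e. with probability min(1, exp(delta_logp)).
    return math.log(random.random()) < delta_logp

random.seed(1)
# With delta_logp = log(0.5), the proposal should be accepted
# about half the time:
accepts = sum(metropolis_accept(math.log(0.5)) for _ in range(10000))
print(accepts / 10000)  # roughly 0.5
```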

When the steps are assigned automatically, the order follows the order of the RVs. But if you specify the steps yourself, you can change that order as well:

with m:
    comp_step1 = pm.CompoundStep([step1, step2])
comp_step1.methods

[<pymc3.step_methods.metropolis.Metropolis at 0x7fcfbeae7be0>,
 <pymc3.step_methods.metropolis.BinaryGibbsMetropolis at 0x7fcfad341eb8>]

with m:
    comp_step2 = pm.CompoundStep([step2, step1])
comp_step2.methods

[<pymc3.step_methods.metropolis.BinaryGibbsMetropolis at 0x7fcfad341eb8>,
 <pymc3.step_methods.metropolis.Metropolis at 0x7fcfbeae7be0>]

During sampling, it always follows the same order for every sample. It could be interesting to have random order implemented in the future as well.
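For what it’s worth, a random-order (“random scan”) variant is easy to sketch (hypothetical, not something PyMC3 implements; the step functions here are made-up stand-ins):

```python
import random

def step_a(point):
    # stand-in step: increment variable 'a'
    new = dict(point)
    new['a'] += 1
    return new

def step_b(point):
    # stand-in step: increment variable 'b'
    new = dict(point)
    new['b'] += 1
    return new

def random_scan_compound_step(point, methods, rng=random):
    # Shuffle the order of the step methods at every iteration
    # instead of always applying them in a fixed order.
    order = list(methods)
    rng.shuffle(order)
    for method in order:
        point = method(point)
    return point

point = {'a': 0, 'b': 0}
for _ in range(10):
    point = random_scan_compound_step(point, [step_a, step_b])
print(point)  # {'a': 10, 'b': 10} -- every step still runs once per iteration
```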