Ho to increase the number of effective samples using Metropolis

How can i increase the number of effective samples produced from Metropolis sampling? I have a simple model that attempts to reproduce a vector of random parameters, p, from synthesized “observed data”, yd. To synthesize yd, i take the exponential of p and add some noise. I’m hoping someone can advise how i might be able to increase the resulting number of effective samples. Currently, some of my parameter values have as few as 1 effective sample. Below is my model:

import numpy as npy
import matplotlib.pyplot as plt
import pymc3 as pm
from scipy import optimize
import os

# Initialize random number generator
npy.random.seed(123)
samples = 10

# True parameter values
preal = npy.random.randn(samples)
psig=0.1
pmu = npy.zeros(samples)

# Simulate outcome variable
merr = pmu + psig * npy.random.randn(samples)
yd = npy.exp(preal) + merr

simpleModel = pm.Model()
with simpleModel:
    p = pm.Normal('p', mu=0, sd=10, shape=samples)
    sigma = pm.HalfNormal('sigma', sd=psig)
    mu = npy.exp(p)
    y = pm.Normal('y', mu=mu, sd=sigma, observed=yd)
    step = pm.Metropolis()
    trace = pm.sample(8000, step=step, progressbar=True, chains=2, tune=2000, random_seed=12345)

print pm.summary(trace, varnames=['p'])

The corresponding output is shown below:

Sequential sampling (2 chains in 1 job)
CompoundStep
>Metropolis: [sigma_log__]
>Metropolis: [p]
100%|███████████████████████████████████| 10000/10000 [00:03<00:00, 2558.08it/s]
100%|███████████████████████████████████| 10000/10000 [00:03<00:00, 2941.53it/s]
The gelman-rubin statistic is larger than 1.4 for some parameters. The sampler did not converge.
The estimated number of effective samples is smaller than 200 for some parameters.
          mean        sd  mc_error   hpd_2.5  hpd_97.5   n_eff      Rhat
p__0 -2.941978  1.623800  0.162051 -5.113717 -0.861878     1.0  5.408710
p__1  0.990024  0.045858  0.001724  0.899093  1.095810   657.0  1.001995
p__2  0.371194  0.093431  0.006091  0.151662  0.531839   148.0  1.018909
p__3 -2.206578  0.495102  0.048346 -3.149443 -1.257777     6.0  1.143571
p__4 -0.698358  0.249800  0.023039 -1.213951 -0.233303    36.0  1.000034
p__5  1.643647  0.024520  0.000559  1.594932  1.696914  1775.0  1.000475
p__6 -1.401166  0.673679  0.066484 -3.109014 -0.525631     6.0  1.047330
p__7 -0.164190  0.139116  0.010004 -0.460551  0.116755   119.0  1.007943
p__8  1.290933  0.034820  0.000887  1.213698  1.361767  1289.0  1.001282
p__9 -0.779513  0.232764  0.020756 -1.251856 -0.288337    18.0  1.069932

Thanks in advance for any assistance.

1 Like

Is there a particular reason you want to use Metropolis? All the parameters are continuous you can get much more efficient samples using the default (NUTS).

1 Like

The yd = exp(p) + noise relationship I’ve used above, with only continuous variables as you mention, was just a practice run for my more complicated model. I’m attempting to use PyMC3 for a statistical inversion problem where the forward model is a PDE solved via esys-escript using FEM. So the output from the forward model, after some required post-processing, is a vector of quantities. Likewise, the forward model only accepts a vector of parameters, p, as input.

In light of this, I have just assumed Metropolis to be the only suitable sampling method, please correct me if i am wrong. Do you think it would be possible to use NUTS? Otherwise, would you have any advice on sampling via Metropolis more effectively? Or perhaps an alternative sampling method?

Cheers,
Nick

The problem of Metropolis is that in high dimension the random walk is nearly impossible to explore the posterior space properly in finite time. For a complex system like PDE, I would suggest you to study the posts by @aseyboldt (see more details here Using pymc3 NUTS sampler for an "external model" having its derivatives with respect to its parameters). In many cases, gradient could be approximated which allow you to use NUTS.

1 Like

Thank you for that additional information Junpenglao. I’m using parameter and state space reduction techniques to combat the difficulties of random walk sampling in high dimensional space (refer C. Lieberman et al paper https://acdl.mit.edu/LiebermanSISC2010.pdf). Therefore, Metropolis would only be sampling from 10 - 15 parameters at most (likely less). I understand this is within the realm of possibilities using Metropolis. I will read up on the post you linked to above also.

Cheers,
Nick

If you know exactly what you are doing, then you can design better random walk kernel to improve Metropolis. One way is to assign different sampler to different parameters, for example assigning a slide sampler to some parameters.

It will be easier to discuss these once you have your code and data set up.

1 Like