Customize Proposal distribution in pm.Metropolis for multivariables

Hi all!
I’m now working on customizing the proposal distribution in pm.Metropolis for multiple variables.
Below is a test (PyMC 5).

import numpy as np
import pymc as pm
import arviz as az

with pm.Model() as Model:
    x = np.array([1, 2, 3, 4, 5])
    y = np.array([2.02, 2.96, 4.05, 4.99, 5.96])
    a = pm.Normal('a', mu = 5, sigma = 1)
    b = pm.Normal('b', mu = 5, sigma = 1)
    model = a * x + b
    step0 = pm.Metropolis()
    ypred = pm.Normal('ypred', mu = model, sigma = 0.01, observed = y)

    trace0 = pm.sample(draws = 1000, tune = 1000, step = step0, chains = 4, return_inferencedata = False)
    step = pm.Metropolis(S = pm.trace_cov(trace0)) 
    trace = pm.sample(draws = 1000, tune = 1000, step = step, chains = 4, return_inferencedata = False)
    trace_to_data = pm.to_inference_data(trace)

    print(az.summary(trace_to_data, round_to = 4))
    az.plot_trace(trace_to_data, combined = True)

The code runs fine up to “step = pm.Metropolis(S = pm.trace_cov(trace0))”,
but an error occurs during “trace = pm.sample(draws = 1000, tune = 1000, step = step, chains = 4, return_inferencedata = False)”: “ValueError: cannot reshape array of size 2 into shape ()”

Has anyone solved this problem?

Welcome.

Can I ask what you are trying to do by customizing the step here? Metropolis seems like a poor fit.

Thanks for your reply!
I would like to implement an algorithm like Adaptive MCMC, which uses a Metropolis-Hastings step whose proposal distribution is updated from previous traces.
First, I run a simple MCMC with Metropolis and obtain a trace named trace0.
Next, I calculate the covariance matrix of trace0.
This matrix then becomes the covariance matrix of the proposal distribution for the next MCMC run, but that run fails.
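For what it’s worth, the two-stage idea itself is sound. Here is a minimal pure-NumPy sketch of it (a standalone illustration, not PyMC internals; the target density, function names, and tuning numbers are all made up for the demo): run random-walk Metropolis with an isotropic proposal, estimate the covariance of the resulting draws, then reuse that estimate as the proposal covariance for a second pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D correlated Gaussian target (stands in for the posterior over a, b)
target_cov = np.array([[1.0, 0.9], [0.9, 1.0]])
target_prec = np.linalg.inv(target_cov)

def log_target(theta):
    return -0.5 * theta @ target_prec @ theta

def metropolis(log_p, start, prop_cov, n_draws, rng):
    """Random-walk Metropolis with a multivariate normal proposal."""
    chol = np.linalg.cholesky(prop_cov)
    draws = np.empty((n_draws, start.size))
    current, lp = start, log_p(start)
    for i in range(n_draws):
        proposal = current + chol @ rng.standard_normal(start.size)
        lp_prop = log_p(proposal)
        # Standard Metropolis accept/reject on the log scale
        if np.log(rng.uniform()) < lp_prop - lp:
            current, lp = proposal, lp_prop
        draws[i] = current
    return draws

# Stage 1: naive isotropic proposal
trace0 = metropolis(log_target, np.zeros(2), np.eye(2), 5000, rng)
# Stage 2: proposal covariance estimated from the stage-1 draws
trace = metropolis(log_target, trace0[-1], np.cov(trace0, rowvar=False), 5000, rng)
```

Adapting the proposal to the correlation structure of the target is exactly what passing S to pm.Metropolis is meant to achieve.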

This is fairly old now, but in case it’s still of interest: I got this working.

The matrix returned by pm.trace_cov covers all the RVs and coordinates in the model, listed in the order given by model.free_RVs. So in this case (since a and b are both scalars) it is a [2x2] matrix.
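Incidentally, the error message in the original post is consistent with this: each scalar RV has shape (), so a proposal scale of size 2 cannot be reshaped to it. The reshape failure is easy to reproduce with plain NumPy (a standalone illustration of the shape mismatch, not PyMC internals):

```python
import numpy as np

# A scalar RV has shape (); trying to squeeze a size-2 array into that
# shape raises the same ValueError seen in the traceback above.
row = np.array([0.1, 0.2])
try:
    row.reshape(())
except ValueError as err:
    print(err)
```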

But when using covariance matrices to define the proposal distributions, I find that you have to supply the covariance submatrix for each RV in turn; you can’t pass one matrix covering all RVs and coordinates at once. This works:

import numpy as np
import pymc as pm
import arviz as az

# %%
with pm.Model() as Model:
    x = np.array([1, 2, 3, 4, 5])
    y = np.array([2.02, 2.96, 4.05, 4.99, 5.96])
    a = pm.Normal('a', mu = 5, sigma = 1)
    b = pm.Normal('b', mu = 5, sigma = 1)
    model = a * x + b
    ypred = pm.Normal('ypred', mu = model, sigma = 0.01, observed = y)

pm.model_to_graphviz(Model)

# %%
# initial run using Metropolis without special config
with Model:
    step0 = pm.Metropolis()
    trace0 = pm.sample(draws = 1000,
                       tune = 1000,
                       step = step0,
                       chains = 4,
                       cores = 1,
                       return_inferencedata = False,
                       )

# %%
with Model:
    az.plot_trace(pm.to_inference_data(trace0), combined = True)
    # looks pretty ugly...

# %% [markdown]
# Get covariance matrix from posterior samples from this initial run:

# %%
with Model:
    # covariance matrix of posterior samples
    trace_cov = pm.trace_cov(trace0)

trace_cov

# %%
# get submatrices for each RV
len_rv = [1,1] # number of coords for each RV
trace_cov_submtrxs = []
j = 0
for n in len_rv:
    k = j+n
    slc = slice(j,k)
    trace_cov_submtrxs.append(trace_cov[slc,slc])
    j += n
trace_cov_submtrxs
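# %%
# The submatrix slicing above generalizes to vector-valued RVs: set each
# entry of len_rv to that RV's number of coordinates. Toy [3x3] example
# (a hypothetical length-2 RV followed by a scalar RV; values made up):
toy_cov = np.array([[4.0, 1.0, 0.5],
                    [1.0, 3.0, 0.2],
                    [0.5, 0.2, 2.0]])
toy_blocks = []
j = 0
for n in [2, 1]:
    slc = slice(j, j + n)
    toy_blocks.append(toy_cov[slc, slc])
    j += n
toy_blocks  # a [2x2] block for the vector RV, a [1x1] block for the scalar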

# %% [markdown]
# Use this to define proposal distribution for 2nd run:

# %%
with Model:
    steps = []
    for rv, cov in zip(Model.free_RVs,trace_cov_submtrxs):
        steps.append(pm.Metropolis(vars=[rv],S=cov))
        
    trace = pm.sample(draws = 1000,
                      tune = 1000,
                      step = steps,
                      chains = 4,
                      cores = 1,
                      return_inferencedata = False)

# %%
with Model:
    trace_to_data = pm.to_inference_data(trace)

print(az.summary(trace_to_data, round_to = 4))
az.plot_trace(trace_to_data, combined = True)

This runs cleanly, but even with this adaptive approach the resulting posteriors still look pretty ugly. That comes back to Metropolis being a poor fit (see the earlier comment in this thread): NUTS would perform much better on a simple linear model like this (but perhaps this is just a toy model?)