Customize Proposal distribution in pm.Metropolis for multivariables

Hi all!
I’m now working on customizing the proposal distribution in pm.Metropolis for multiple variables.
Below is a test (PyMC 5).

import numpy as np
import pymc as pm
import arviz as az

with pm.Model() as Model:
    x = np.array([1, 2, 3, 4, 5])
    y = np.array([2.02, 2.96, 4.05, 4.99, 5.96])
    a = pm.Normal('a', mu = 5, sigma = 1)
    b = pm.Normal('b', mu = 5, sigma = 1)
    model = a * x + b
    step0 = pm.Metropolis()
    ypred = pm.Normal('ypred', mu = model, sigma = 0.01, observed = y)

    trace0 = pm.sample(draws = 1000, tune = 1000, step = step0, chains = 4, return_inferencedata = False)
    step = pm.Metropolis(S = pm.trace_cov(trace0)) 
    trace = pm.sample(draws = 1000, tune = 1000, step = step, chains = 4, return_inferencedata = False)
    trace_to_data = pm.to_inference_data(trace)

    print(az.summary(trace_to_data, round_to = 4))
    az.plot_trace(trace_to_data, combined = True)

The code runs fine up to “step = pm.Metropolis(S = pm.trace_cov(trace0))”,
but an error occurs during “trace = pm.sample(draws = 1000, tune = 1000, step = step, chains = 4, return_inferencedata = False)”: “ValueError: cannot reshape array of size 2 into shape ()”

Has anyone solved this problem?

Welcome.

Can I ask what you are trying to do by customizing the step here? Metropolis seems like a poor fit.

Thanks for your reply!
I would like to implement an algorithm like Adaptive MCMC, which uses a Metropolis-Hastings step whose proposal distribution is updated from previous traces.
First, I run a simple MCMC with Metropolis and obtain a trace named trace0.
Next, I calculate the covariance matrix of trace0.
This matrix then becomes the covariance matrix of the proposal distribution for the next MCMC run, but that run fails.
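For what it’s worth, the two-stage idea itself is sound. Here is a minimal pure-NumPy sketch of it (a standalone illustration, not PyMC internals; the target density, function names, and tuning numbers are all made up for the demo): run random-walk Metropolis with an isotropic proposal, estimate the covariance of the resulting draws, then reuse that estimate as the proposal covariance for a second pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D correlated Gaussian target (stands in for the posterior over a, b)
target_cov = np.array([[1.0, 0.9], [0.9, 1.0]])
target_prec = np.linalg.inv(target_cov)

def log_target(theta):
    return -0.5 * theta @ target_prec @ theta

def metropolis(log_p, start, prop_cov, n_draws, rng):
    """Random-walk Metropolis with a multivariate normal proposal."""
    chol = np.linalg.cholesky(prop_cov)
    draws = np.empty((n_draws, start.size))
    current, lp = start, log_p(start)
    for i in range(n_draws):
        proposal = current + chol @ rng.standard_normal(start.size)
        lp_prop = log_p(proposal)
        # Standard Metropolis accept/reject on the log scale
        if np.log(rng.uniform()) < lp_prop - lp:
            current, lp = proposal, lp_prop
        draws[i] = current
    return draws

# Stage 1: naive isotropic proposal
trace0 = metropolis(log_target, np.zeros(2), np.eye(2), 5000, rng)
# Stage 2: proposal covariance estimated from the stage-1 draws
trace = metropolis(log_target, trace0[-1], np.cov(trace0, rowvar=False), 5000, rng)
```

Adapting the proposal to the correlation structure of the target is exactly what passing S to pm.Metropolis is meant to achieve.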

This is fairly old now, but in case it’s still of interest: I got this working.

The matrix returned by pm.trace_cov covers all the RVs and coordinates in the model, listed in the order given by model.free_RVs. So in this case (since a and b are both scalars) it is a [2x2] matrix.
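Incidentally, the error message in the original post is consistent with this: each scalar RV has shape (), so a proposal scale of size 2 cannot be reshaped to it. The reshape failure is easy to reproduce with plain NumPy (a standalone illustration of the shape mismatch, not PyMC internals):

```python
import numpy as np

# A scalar RV has shape (); trying to squeeze a size-2 array into that
# shape raises the same ValueError seen in the traceback above.
row = np.array([0.1, 0.2])
try:
    row.reshape(())
except ValueError as err:
    print(err)
```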

But when using covariance matrices to define the proposal distributions, I find that you have to supply the covariance submatrix for each RV in turn; you can’t pass one matrix covering all RVs and coordinates at once. This works:

import numpy as np
import pymc as pm
import arviz as az

# %%
with pm.Model() as Model:
    x = np.array([1, 2, 3, 4, 5])
    y = np.array([2.02, 2.96, 4.05, 4.99, 5.96])
    a = pm.Normal('a', mu = 5, sigma = 1)
    b = pm.Normal('b', mu = 5, sigma = 1)
    model = a * x + b
    ypred = pm.Normal('ypred', mu = model, sigma = 0.01, observed = y)

pm.model_to_graphviz(Model)

# %%
# initial run using Metropolis without special config
with Model:
    step0 = pm.Metropolis()
    trace0 = pm.sample(draws = 1000,
                       tune = 1000,
                       step = step0,
                       chains = 4,
                       cores = 1,
                       return_inferencedata = False,
                       )

# %%
with Model:
    az.plot_trace(pm.to_inference_data(trace0), combined = True)
    # looks pretty ugly...

# %% [markdown]
# Get covariance matrix from posterior samples from this initial run:

# %%
with Model:
    # covariance matrix of posterior samples
    trace_cov = pm.trace_cov(trace0)

trace_cov

# %%
# get submatrices for each RV
len_rv = [1,1] # number of coords for each RV
trace_cov_submtrxs = []
j = 0
for n in len_rv:
    k = j+n
    slc = slice(j,k)
    trace_cov_submtrxs.append(trace_cov[slc,slc])
    j += n
trace_cov_submtrxs
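# %%
# The submatrix slicing above generalizes to vector-valued RVs: set each
# entry of len_rv to that RV's number of coordinates. Toy [3x3] example
# (a hypothetical length-2 RV followed by a scalar RV; values made up):
toy_cov = np.array([[4.0, 1.0, 0.5],
                    [1.0, 3.0, 0.2],
                    [0.5, 0.2, 2.0]])
toy_blocks = []
j = 0
for n in [2, 1]:
    slc = slice(j, j + n)
    toy_blocks.append(toy_cov[slc, slc])
    j += n
toy_blocks  # a [2x2] block for the vector RV, a [1x1] block for the scalar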

# %% [markdown]
# Use this to define proposal distribution for 2nd run:

# %%
with Model:
    steps = []
    for rv, cov in zip(Model.free_RVs,trace_cov_submtrxs):
        steps.append(pm.Metropolis(vars=[rv],S=cov))
        
    trace = pm.sample(draws = 1000,
                      tune = 1000,
                      step = steps,
                      chains = 4,
                      cores = 1,
                      return_inferencedata = False)

# %%
with Model:
    trace_to_data = pm.to_inference_data(trace)

print(az.summary(trace_to_data, round_to = 4))
az.plot_trace(trace_to_data, combined = True)

This runs cleanly, but even with this adaptive approach the resulting posteriors still look pretty ugly. That comes back to Metropolis being a poor fit (see the earlier comment in this thread): NUTS would perform much better on a simple linear model like this (but perhaps this is just a toy model?)