Initializing NUTS with a covariance matrix as the mass matrix

Hello,

I want to initialize the mass matrix of the NUTS sampler with a covariance matrix calculated from the posterior samples of an MCMC chain that has converged. My small example is:

import pandas as pd
import numpy as np
import pymc3 as pm

#### Generate data
np.random.seed(123)
alpha, sigma = 1, 1 # True parameter values
size = 100 # Size of dataset

# Simulate outcome variable
Y = alpha + np.random.randn(size)*sigma

basic_model = pm.Model()
with basic_model:

   # Priors for unknown model parameters
   alpha = pm.Normal('alpha', mu=0, sd=10)
   sigma = pm.HalfNormal('sigma', sd=1)

   # Expected value of outcome
   mu = alpha

   # Likelihood (sampling distribution) of observations
   Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
   step1 = pm.NUTS(vars=[alpha, sigma], is_cov=True)
   trace = pm.sample(500, step=[step1], tune=1000, discard_tuned_samples=False)

# Create the dataframe of posterior samples in the predefined order
df = pd.DataFrame()
for myname in trace.varnames:
  if myname in ["alpha", "sigma"]:
    k = pd.DataFrame(trace[500:][myname])
    print(myname)
    df = pd.merge(df, k, left_index=True, right_index=True, how='outer')

# var/cov matrix
covmatrix = np.cov(df.values, rowvar=False)  # columns are the variables
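As a side note, `np.cov` treats each row as a variable by default, so with posterior draws stored in rows (one column per parameter) you need `rowvar=False`; a minimal sketch with simulated draws:

```python
import numpy as np

rng = np.random.RandomState(0)
samples = rng.randn(1000, 2)  # 1000 draws of 2 parameters, draws in rows

# Default rowvar=True treats each ROW as a variable -> a 1000x1000 matrix
wrong = np.cov(samples)
# rowvar=False treats each COLUMN as a variable -> the desired 2x2 matrix
right = np.cov(samples, rowvar=False)

print(wrong.shape)  # (1000, 1000)
print(right.shape)  # (2, 2)
```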

Then, if I run my model again, only replacing the step method with

step1 = pm.NUTS(vars=[alpha, sigma], is_cov=True, scaling=covmatrix)

the sampling step always fails with

“PositiveDefiniteError: Scaling is not positive definite: Simple check failed. Diagonal contains negatives”

Of course, tuning NUTS would not be necessary in this small and simple example. Nevertheless, it should still work, shouldn’t it?
The calculated covariance matrix can be inverted with NumPy, so it is not singular. Thus, I don’t understand the error message. Can anyone help me get this code working?
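Note that invertibility alone does not imply positive definiteness: an invertible matrix can still have negative eigenvalues. A quick way to check positive definiteness is to attempt a Cholesky factorization, e.g.:

```python
import numpy as np

def is_positive_definite(m):
    """Return True if m is symmetric positive definite (Cholesky succeeds)."""
    try:
        np.linalg.cholesky(m)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite(np.array([[2.0, 0.0], [0.0, 1.0]])))   # True
print(is_positive_definite(np.array([[-1.0, 0.0], [0.0, 1.0]])))  # False: invertible but not PD
```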

You can follow the code here:

The problem with pm.trace_cov() is that it creates a covariance matrix of all parameters. If I am sampling, say, one parameter with a different step method, I won’t be able to use it.
Thus, I am creating my own covariance matrix using np.cov() of only the relevant parameters.
Which order do the parameters have to be in?
The order of the trace (i.e. the order of the columns when saving it with the text backend)?
Or the order that is displayed in the console output, e.g. NUTS: [kappa, s, alpha, gamma, sigmasq]?
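Whatever the required order turns out to be, reordering an already-computed covariance matrix to match it is just a simultaneous row/column permutation; a minimal sketch with made-up variable names and values:

```python
import numpy as np

names = ["sigma", "alpha"]    # order the covariance was computed in
target = ["alpha", "sigma"]   # order the sampler expects (assumed for illustration)
perm = [names.index(n) for n in target]

cov = np.array([[2.0, 0.3],
                [0.3, 1.0]])
# Permute rows and columns together so the matrix follows `target`
cov_reordered = cov[np.ix_(perm, perm)]
```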

Oh right, I see. Yeah, that’s a tricky one: sigma is a bounded random variable, and PyMC3 samples everything in continuous (transformed) space, so you need to use sigma on the transformed (log) scale. Maybe try something like:

df = pd.DataFrame()
for myname in trace.varnames:
  if myname in ["alpha", "sigma_log__"]:
    k = pd.DataFrame(trace[500:][myname])
    print(myname)
    df = pd.merge(df, k, left_index=True, right_index=True, how='outer')

# var/cov matrix
covmatrix = np.cov(df.values, rowvar=False)  # columns are the variables

@junpenglao I don’t think that is correct. You still don’t know the correct order.

You can use the dict -> array mapping from model.logp_dlogp_function([vars]).dict_to_array. That is the order NUTS uses internally.


I assume this corresponds to the order in the list trace.varnames?

In your case you need to do model.logp_dlogp_function([alpha, sigma]).dict_to_array to get the correct mapping.

@junpenglao When I execute this code I get the error:

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
----> 1 basic_model.logp_dlogp_function([alpha, sigma]).dict_to_array()

TypeError: dict_to_array() missing 1 required positional argument: ‘point’

What is the ‘point’ argument I have to pass?

It’s the fancy way of saying dict; it should have the same structure as point = basic_model.test_point.
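Conceptually, dict_to_array just concatenates each variable’s raveled value in the function’s internal ordering. A rough sketch of that idea in plain numpy (this is an illustrative helper, not PyMC3’s actual implementation):

```python
import numpy as np

def dict_to_array(point, order):
    """Flatten a point dict into one vector, following `order` (sketch only)."""
    return np.concatenate([np.asarray(point[name]).ravel() for name in order])

# A point with the same structure as basic_model.test_point might look like:
point = {"alpha": np.array(0.0), "sigma_log__": np.array(0.5)}
flat = dict_to_array(point, order=["alpha", "sigma_log__"])
```
The positions in `flat` tell you which row/column of the mass matrix each parameter occupies.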

And by the way, you could also try https://github.com/aseyboldt/covadapt
I’m working on further improving and testing it right now, but even as it is, it should work better than initializing with the empirical covariance.