The 95% interval of PyMC3 trace doesn't cove the real values

Try something like this:

import numpy as np
import pandas as pd
import pymc3 as pm3
import matplotlib.pylab as plt
import seaborn as sb

%matplotlib inline

samples = pd.read_csv('~/Downloads/sample.csv', sep=',')

# some records may have missing values. So, skip them during the modelling
masked = samples[~samples.gc.isnull()]
num_ind = len(masked.idx.unique())
idx = np.asarray(masked.idx.astype(int))

with pm3.Model() as hierarchical_model:
    mu_intercept = pm3.Normal('mu_intercept', mu=0, sd=30)
    mu_d = pm3.Normal('mu_d', mu=0, sd=30)
    sigma_intercept = pm3.HalfCauchy('sigma_intercept', beta=2.5)
    sigma_d = pm3.HalfCauchy('sigma_d', beta=2.5)
    sigma_gc = pm3.HalfCauchy('sigma_gc', beta=2.5)
    sigma_p = pm3.HalfCauchy('sigma_p', beta=50)
    
    intercept_prior = pm3.Normal('intercept_prior_raw',
                                 mu=0,
                                 sd=1,
                                 shape=num_ind)
    intercept_prior = pm3.Deterministic(
        'intercept_prior', mu_intercept + sigma_intercept * intercept_prior)
    
    d_prior = pm3.Normal('d_prior_raw', mu=0, sd=1, shape=num_ind)
    d_prior = pm3.Deterministic(
        'd_prior', mu_d + sigma_d * d_prior)
    

    gc_prior = pm3.HalfNormal('gc_prior_raw',
                              sd=1,
                              shape=num_ind)
    gc_prior = pm3.Deterministic('gc_prior', sigma_gc * gc_prior)

    p_prior = pm3.Normal('p_prior_raw', mu=0, sd=1, shape=(5, num_ind))
    p_prior = pm3.Deterministic('p_prior', sigma_p * p_prior)

    y_est = (
        intercept_prior[idx] +
        d_prior[idx] * masked.d +
        gc_prior[idx] * masked.gc +
        p_prior[0, idx] * masked.p0 +
        p_prior[1, idx] * masked.p1 +
        p_prior[2, idx] * masked.p2 +
        p_prior[3, idx] * masked.p3 +
        p_prior[4, idx] * masked.p4
    )

    y_est = pm3.Deterministic('y_est', y_est)

    epsilon = pm3.HalfCauchy('epsilon', beta=2.5)
    pm3.StudentT('y_pred',
                 nu=7,
                 mu=y_est,
                 sd=epsilon,
                 observed=masked.y)

I mainly changed switched to a non-centered parametrization, and fixed a problem with epsilon. It runs with NUTS now (pm.sample(1000, njobs=4)), although there still is some trouble with the posterior I haven’t figured out. The it reaches max_treedepth a lot.

Don’t switch to metropolis if nuts doesn’t work well. If NUTS doesn’t work, then Metropolis most likely won’t either, it just hides this much better.

1 Like