Reparameterization of State-Space / Structural Time Series models

In this very simple state-space model, the traces for state_innovation_sd and observation_sd are, understandably, negatively correlated:

import numpy as np
import pandas as pd
import pymc3 as pm

# Simulate a Gaussian random walk observed with noise
state_innovation_sd = 1
observation_sd = 0.3
n_obs = 3000
obs = [0]
last_state = 0
for i in range(n_obs - 1):
    next_state = last_state + np.random.normal(scale=state_innovation_sd)
    obs.append(next_state + np.random.normal(scale=observation_sd))
    last_state = next_state
df = pd.DataFrame({'obs': obs})

with pm.Model() as model:
    state_innovation_sd = pm.HalfNormal('state_innovation_sd', 2)
    state = pm.GaussianRandomWalk('walk', sd=state_innovation_sd, shape=n_obs)
    observation_sd = pm.HalfNormal('observation_sd', 2)
    obs = pm.Normal('obs', mu=state, sd=observation_sd, observed=df.obs.values)

with model:
    trace = pm.sample(chains=1)

pm.traceplot(trace, varnames=['state_innovation_sd', 'observation_sd'])

Can anyone recommend a way to reparameterize the model so that sampling is more efficient?

I don’t know how much reparameterization is going to help with the specific parameter values you have chosen. When I roughly map out an approximate posterior over state_innovation_sd and observation_sd, I get a huge swath of parameter space with very high posterior probability.

import matplotlib.pyplot as plt
from scipy.stats import norm

# Mesh of (state_innovation_sd, observation_sd) values to evaluate the log-posterior on
x = np.array(np.meshgrid(np.linspace(0.1, 2, 50), np.linspace(0.1, 2, 50)))

posterior = []
for i in range(50):
    this_posterior = []
    for j in range(50):
        # Simulate one walk with innovation sd x[0, i, j], then score the data
        # against it with observation sd x[1, i, j]
        innovations = np.random.normal(loc=0, scale=x[0, i, j], size=len(df.obs))
        mean = np.cumsum(innovations)
        this_posterior.append(np.sum(norm.logpdf(x=df.obs, loc=mean, scale=x[1, i, j])))
    posterior.append(this_posterior)
posterior = np.array(posterior)

plt.imshow(posterior)
plt.xticks(np.arange(0, 50, 5), np.around(np.linspace(0.1, 2, 10), decimals=1))
plt.yticks(np.arange(0, 50, 5), np.around(np.linspace(0.1, 2, 10), decimals=1))

So a whole wide range of values have high posterior probability, and it seems that it is really hard for the sampler to explore this region.


One way to reparameterize this is to model the total variance each data point can have, and then split that variance between innovation_sd and observation_sd. There is more detail in this doc: