In this very simple state-space model, the traces for state_innovation_sd and observation_sd are understandably negatively correlated:
import numpy as np
import pandas as pd
import pymc3 as pm

# True parameters for the simulated data.
state_innovation_sd = 1
observation_sd = .3
n_obs = 3000

# Simulate a latent random walk observed with Gaussian noise.
obs = [0]
last_state = 0
for i in range(n_obs - 1):
    next_state = last_state + np.random.normal(scale=state_innovation_sd, size=1)[0]
    obs.append(next_state + np.random.normal(scale=observation_sd, size=1)[0])
    last_state = next_state

df = pd.DataFrame({'obs': obs})
df.obs.plot()

with pm.Model() as model:
    # Priors on both scale parameters (these shadow the true values above).
    state_innovation_sd = pm.HalfNormal('state_innovation_sd', 2)
    state = pm.GaussianRandomWalk('walk', sd=state_innovation_sd, shape=n_obs)
    observation_sd = pm.HalfNormal('observation_sd', 2)
    obs = pm.Normal('obs', mu=state, sd=observation_sd, observed=df.obs.values)

with model:
    trace = pm.sample(chains=1)

pm.traceplot(trace, varnames=['state_innovation_sd', 'observation_sd'])
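A quick way to see the correlation directly (a minimal sketch using the trace above; the matplotlib import is assumed) is to scatter the two marginal traces against each other:

import matplotlib.pyplot as plt

plt.scatter(trace['state_innovation_sd'], trace['observation_sd'], alpha=0.3)
plt.xlabel('state_innovation_sd')
plt.ylabel('observation_sd')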
Can anyone recommend a way to reparameterize the model so that sampling is more efficient?
Thanks!
I don’t know how much reparameterization is going to help with the specific parameter values you have chosen. When I roughly map out an approximate posterior for state_innovation_sd and observation_sd, I get a huge swath of parameter space with very high posterior probability.
from scipy.stats import norm
import matplotlib.pyplot as plt

# Mesh to evaluate the (rough, flat-prior) log posterior on.
# x[0] varies along columns (j), x[1] along rows (i).
x = np.meshgrid(np.linspace(0.1, 2, 50), np.linspace(0.1, 2, 50))
x = np.array(x)

posterior = []
for i in range(50):
    this_posterior = []
    for j in range(50):
        # One simulated walk per grid point: a crude, noisy estimate of the
        # log likelihood at this (innovation scale, observation scale) pair.
        innovations = np.random.normal(loc=0, scale=x[0, i, j], size=len(df.obs))
        mean = np.cumsum(innovations)
        this_posterior.append(np.sum(norm.logpdf(x=df.obs, loc=mean, scale=x[1, i, j])))
    posterior.append(this_posterior)
posterior = np.array(posterior)

plt.pcolor(posterior)
plt.xticks(np.arange(0, 50, 5), np.around(np.linspace(0.1, 2, 10), decimals=1))
plt.yticks(np.arange(0, 50, 5), np.around(np.linspace(0.1, 2, 10), decimals=1))
plt.colorbar()
# Columns (x axis) correspond to x[0], the innovation scale; rows (y axis)
# to x[1], the observation scale.
plt.xlabel("state_innovation_sd")
plt.ylabel("observation_sd")
So a wide range of values has high posterior probability, and it seems this broad, flat region is genuinely hard for the sampler to explore.
One way to reparameterize this is to model the total variance each data point can have, and then distribute that variance between innovation_sd and observation_sd. There is more detail in this doc:
http://docs.pymc.io/notebooks/GP-smoothing.html
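A minimal sketch of that idea, assuming the same df and n_obs as above (this is my rough reading of the variance-splitting trick, not necessarily the exact parameterization from the linked notebook): put priors on the total scale and on the fraction of variance assigned to the innovations, and derive both sd parameters deterministically:

import pymc3 as pm

with pm.Model() as reparam_model:
    # Total scale shared by both noise sources (hypothetical prior choice).
    total_sd = pm.HalfNormal('total_sd', 2)
    # Fraction of the total variance attributed to the state innovations.
    frac = pm.Beta('frac', alpha=1, beta=1)
    state_innovation_sd = pm.Deterministic(
        'state_innovation_sd', total_sd * pm.math.sqrt(frac))
    observation_sd = pm.Deterministic(
        'observation_sd', total_sd * pm.math.sqrt(1 - frac))
    state = pm.GaussianRandomWalk('walk', sd=state_innovation_sd, shape=n_obs)
    pm.Normal('obs', mu=state, sd=observation_sd, observed=df.obs.values)
    trace = pm.sample(chains=1)

This replaces the two weakly identified scales with one better-identified total scale plus a bounded mixing fraction, which can give the sampler a better-conditioned geometry than the long diagonal ridge above.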