I’m running into difficulties producing a generative model for a time series. I have a model as follows:
import numpy as np
import pymc3 as pm
import theano.tensor as tt

# Y is the observed series of length N (defined earlier); everything past
# cutoff_idx is masked so PyMC3 treats those points as missing and imputes them
cutoff_idx = 1000
y_obs = np.ma.MaskedArray(Y, np.arange(N) > cutoff_idx)
interval = 100

with pm.Model() as genModel:
    sigma_mu = pm.HalfNormal('s_mu', sd=0.05)
    sigma = pm.HalfNormal('s', sd=0.3)
    # random-walk level, one value per block of y_obs.size // interval observations
    mu = pm.GaussianRandomWalk('mu', mu=0.0, sd=sigma_mu, shape=interval)
    weights = tt.repeat(mu, y_obs.size // interval)
    y = pm.Normal('y', mu=weights, sd=sigma, observed=y_obs)
I want this model to learn the standard deviations sigma and sigma_mu, but not the individual mu values in my Gaussian random walk. This is because I want to be able to generate more data points (via the masked array) that simulate the underlying data-generating process, and I don’t want the model to learn a trend in the random walk (I happen to know there is none, so any such learning would be spurious).
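For concreteness, the generation step I have in mind looks roughly like this (a minimal sketch; as I understand it, PyMC3 automatically creates a 'y_missing' variable for the masked entries, and its posterior draws are the simulated continuation of the series):

with genModel:
    trace = pm.sample(1000, tune=1000)

# draws for the auto-imputed masked entries, shape (n_draws, n_masked)
simulated = trace['y_missing']
print(simulated.mean(axis=0)[:10])  # first few imputed points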
In other words, I noticed that with the above code, if the data in Y happen to lie mostly below 0, for example, then the mu variable learns to generate mostly values below 0. This is not what I want: I’m using the random-walk prior only to learn its variance, not any specific trend.
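This is roughly how the effect shows up when I look at the posterior mean of mu (again just a sketch, reusing the trace from the sampling snippet above):

import matplotlib.pyplot as plt

mu_post = trace['mu']             # shape (n_draws, interval)
plt.plot(mu_post.mean(axis=0))    # posterior mean of the walk follows the sample mean of Y
plt.axhline(0.0, linestyle='--')  # the flat level I would expect with no trend
plt.show()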
What would be the right way to achieve this (assuming it’s possible)?