I am new to pymc3 and I have scanned through the list of outstanding/closed issues and couldn’t find an exact solution so I am posting it here. If there is any existing soln or written piece, I’d appreciate if you can link it to me.
The Problem
I am trying to sample from a mvGaussianRandomwalk with a switching component where the drift will change permanently after each time period with a probability p. The natural conclusion I came to was to use a geometric variable where its values represent the point of switching. However, using the metropolis sampler on newest theano (and stable release of pymc3), I find that part is not being properly sampled when I increase the size of observations that I have. See picture below
The Code
with pm.Model() as position_model:
eta = pm.distributions.continuous.Exponential('eta', lam=0.5, testval=0.5)
sd = pm.distributions.continuous.Exponential('sd', lam=0.5, shape=n, testval=0.5)
C_triu = pm.distributions.multivariate.LKJCorr('C_triu', eta=eta, n=n, testval=0.5)
C = pm.Deterministic('C', tt.fill_diagonal(C_triu[np.zeros((n, n), dtype=np.int32)], 1.))
sigma_diag = pm.Deterministic('sigma_mat', tt.nlinalg.diag(sd))
sigma = pm.Deterministic('cov', tt.nlinalg.matrix_dot(sigma_diag, C, sigma_diag))
mu_start = pm.distributions.continuous.Normal('start_drift', mu=5., shape=n, sd=10.)
mu_hold = pm.distributions.continuous.Normal('hold_drift', mu=0., sd=10., shape=n)
p = pm.distributions.continuous.Beta('p_hold', alpha=1, beta=1)
t_hold = pm.Geometric('time_until_hold', p=p, dtype='int64')
pos = []
for i in range(len(obs['CT'])):
px = obs['CT'][i]
obs_n = len(px)
tth = pm.math.minimum(t_hold, obs_n-1)
ht = obs_n-tth-1
drift = pm.math.concatenate([tt.tile(mu_start, (tth,1)), tt.tile(mu_hold, (ht,1))])
pos.append(pm.distributions.timeseries.MvGaussianRandomWalk('position_CT_rnd_%i' % i, mu=drift,
cov=sigma, init=inital_pos["CT"],
observed=px))
print 'Starting PyMC with %i samples' % len(pos) # Starting PyMC with 94 samples
start = pm.find_MAP(vars=position_model.cont_vars)
print start
step = pm.Metropolis()
trace = pm.sample(n_samples, step=step, start=start, tune=2000)
I believe the problem lies within this piece:
obs_n = len(px)
tth = pm.math.minimum(t_hold, obs_n-1)
ht = obs_n-tth-1
drift = pm.math.concatenate([tt.tile(mu_start, (tth,1)), tt.tile(mu_hold, (ht,1))])
Since each observation (a Tx2 vector) differs in size, restricting the geometric variable with min and max can produce intractable results. Under small samples (e.g range(5)), the geometric variable seems to be sampling well and produce different results each iteration but under the entire 94 samples, it seems to produce the same output. Also, let me know if my drift specification is correct
Things I’ve tried:
- Using different samplers such as NUTS (very slow, < 1 it/s), Metropolis (currently only one working), HamiltonMC (get typeError: expected type_num 11 (NPY_FLOAT32) but got 12).
- Using pm.math.switch for drift but it throws me a dimension mismatch error (my mu vectors are length 2)
- Using Bound for Geometric var (throws errors on negative array lengths).