I am trying to fit a time series model where the error measured at time t depends on the error at time t-1:
$$ Y_{t} = a + b\cos\left(\tfrac{\pi}{12}(t-c)\right) + d \cdot \text{activity}_t + \epsilon_t $$
$$ \epsilon_t \sim \mathcal{N}\left(\theta\,\epsilon_{t-1},\ \sigma^2\right) $$
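To make sure I'm reading the equations right, here is a minimal simulation of that generative process as I understand it (all parameter values and the activity series below are made up for illustration):

```python
# Minimal simulation of the model above, just to check my understanding.
# Parameter values and the activity series are placeholders, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
T = 500
a_true, b_true, c_true, d_true = 60.0, 5.0, 3.0, 2.0
theta_true, sigma_true = 0.6, 1.0

t_sim = np.arange(T)
activity_sim = rng.random(T)

# AR(1) errors: eps_t ~ N(theta * eps_{t-1}, sigma^2)
eps = np.zeros(T)
for i in range(1, T):
    eps[i] = theta_true * eps[i - 1] + sigma_true * rng.normal()

y_sim = a_true + b_true * np.cos(np.pi / 12 * (t_sim - c_true)) + d_true * activity_sim + eps
```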
I couldn't get my model to sample correctly. Could you please help me figure out where I went wrong?
```python
# y, t, activity and bin_size come from my data (defined earlier, not shown)
with pm.Model() as ar1_cos_steps:
    # priors
    a = pm.Normal('a', mu=0, sigma=10.0)   # Hr_base
    b = pm.Normal('b', mu=1, sigma=1.0)    # amplitude
    c = pm.Normal('c', mu=1, sigma=1.0)    # phase parameter
    d = pm.Normal('d', mu=0, sigma=10.0)   # activity coefficient
    # sigma = pm.HalfNormal("sigma", sigma=1)
    theta = pm.Normal("theta", 0.0, 1.0)   # error correlation term
    tau = pm.Exponential("tau", 0.5)       # precision of the innovation term, tau = 1/sigma^2
    # Gaussian noise
    err_center = pm.Normal("center", mu=0.0, sigma=1.0)

    # expected value of the outcome
    mu_y = a + b * pm.math.cos(np.pi * (t - c) / (12 * 60 / bin_size)) + d * activity

    # noise term
    err = y - mu_y
    likelihood = pm.AR1("err", k=theta, tau_e=tau, observed=err - err_center)

    trace_ar1 = pm.sample(3000, tune=2000, step=pm.Metropolis())  # the Metropolis sampler is a lot easier to work with
    idata_ar1 = az.from_pymc3(trace_ar1)
```
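In case it helps with diagnosing the problem, this is roughly how I've been inspecting the output (nothing fancy; it assumes the sampling above actually finishes):

```python
# Quick checks on the posterior samples from the model above
import arviz as az

print(az.summary(idata_ar1, var_names=["a", "b", "c", "d", "theta", "tau"]))
az.plot_trace(idata_ar1, var_names=["a", "b", "theta"])
```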
I am coding up this model as an exercise, based on a Cell Reports paper I found. Could you also help me understand why the authors wanted the errors to be correlated? They give no explanation in their methods section, and it seems counterintuitive to me: why would one want a model that consistently underestimates or overestimates? Please forgive my limited understanding of the topic. Thanks so much!
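If it matters, this is how I was planning to check whether the residuals are actually correlated in time once the model samples. The helper below is my own sketch, and the residuals are made up purely to show the call:

```python
# Rough check of lag-1 autocorrelation in a residual series
import numpy as np

def lag1_autocorr(resid):
    """Lag-1 autocorrelation of a residual series (my own placeholder helper)."""
    resid = np.asarray(resid, dtype=float)
    resid = resid - resid.mean()
    return np.corrcoef(resid[:-1], resid[1:])[0, 1]

# made-up residuals, only to demonstrate the call
rng = np.random.default_rng(1)
fake_resid = rng.normal(size=200)
print(lag1_autocorr(fake_resid))
```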