I’m modeling data as a signal plus autoregressive noise. For example, here’s some test data:
```python
import numpy as np

# synthetic test data
nt = 500
noise = np.zeros((nt,))
# true AR(1) coefficient (|theta| < 1 keeps the noise stationary):
true_theta = 0.3
# true standard deviation of the innovations:
true_sigma = 1.0
for t in range(1, nt):
    noise[t] = true_theta * noise[t - 1] + np.random.normal(loc=0, scale=true_sigma)

x = np.linspace(0, 1, nt)
true_signal = 2 * x**2
true_beta = 0.7
# the synthetic observations are the scaled signal plus AR(1) noise
obs = true_beta * true_signal + noise
# the true signal is quadratic; what if we looked for a linear signal instead?
other_signal = x
```
I have two models: one with the “true” signal:
```python
import pymc3 as pm

with pm.Model() as true_signal_model:
    # AR(1) coefficient
    theta = pm.Uniform("theta", -1.0, 1.0)
    # precision of the innovation term
    tau = pm.Exponential("tau", 0.5)
    # scale of the quadratic "true" signal
    beta = pm.Normal("beta", 0, 1)
    signal = pm.Deterministic("forced_response", beta * true_signal)
    likelihood_obs = pm.AR1("likelihood_obs", k=theta, tau_e=tau, observed=obs - signal)
    goodtrace = pm.sample(return_inferencedata=True)
```
and one with the linear signal:
```python
with pm.Model() as bad_signal_model:
    # AR(1) coefficient
    theta = pm.Uniform("theta", -1.0, 1.0)
    # precision of the innovation term
    tau = pm.Exponential("tau", 0.5)
    # scale the linear signal
    beta = pm.Normal("beta", 0, 1)
    signal = pm.Deterministic("forced_response", beta * other_signal)
    likelihood_obs = pm.AR1("likelihood_obs", k=theta, tau_e=tau, observed=obs - signal)
    badtrace = pm.sample(return_inferencedata=True)
```
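When I inspect the pointwise log likelihoods stored on these traces, each model exposes only a single term (a quick check; the shapes assume 4 chains of 1,000 draws, as in my runs):

```python
import arviz as az

# The AR1 likelihood is one vector-valued distribution over the whole
# series, so ArviZ stores a single pointwise log-likelihood entry per draw:
print(goodtrace.log_likelihood["likelihood_obs"].shape)  # (4, 1000, 1)
print(badtrace.log_likelihood["likelihood_obs"].shape)   # (4, 1000, 1)
```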
It strikes me that I should be able to use LOO or WAIC to test which of these models better captures the structure in the observations. If I were doing a standard regression, $obs \sim \mathcal{N}(\beta \cdot \text{signal}, \sigma)$, I’d have log likelihoods of shape (4, 1000, 500), one entry per data point, and could just use az.loo (a sketch of what I mean is at the end of this post). But, as the check above shows, the log likelihoods here come out with shape (4, 1000, 1), because the whole series enters the model as a single draw from the joint AR1 distribution. Basically: how do you do LOO when there is nontrivial covariance structure in your data? Any advice on model comparison in this case? Thanks!
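For concreteness, the standard-regression comparison I have in mind would look something like this (a minimal sketch, not from my actual analysis: an elementwise Normal likelihood that ignores the AR(1) structure, so every one of the 500 observations gets its own log-likelihood term):

```python
import arviz as az

with pm.Model() as iid_model:
    # scale of the (deliberately misspecified) i.i.d. noise
    sigma = pm.Exponential("sigma", 0.5)
    beta = pm.Normal("beta", 0, 1)
    # elementwise Normal likelihood: one log-likelihood term per observation
    pm.Normal("likelihood_obs", mu=beta * true_signal, sigma=sigma, observed=obs)
    iid_trace = pm.sample(return_inferencedata=True)

# now the log_likelihood group has one entry per data point,
# so az.loo works out of the box:
print(iid_trace.log_likelihood["likelihood_obs"].shape)  # (4, 1000, 500)
print(az.loo(iid_trace))
```

This is exactly the per-observation structure that the AR1 models above don’t produce.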