Oh, I should mention that some degree of “seeing double” when you look at the in-sample plots is correct. Here’s the same model in statsmodels:
import statsmodels.api as sm
mod = sm.tsa.UnobservedComponents(dat['log_y'], level=True, trend=True,
stochastic_level=True, stochastic_trend=True,
irregular=False)
res = mod.fit()
fig, ax = plt.subplots(figsize=(14,4), dpi=144)
res.predict()[1:].plot(ax=ax)
dat.log_y.iloc[1:].plot(ax=ax)
Why does this happen? Each blue value is the one-step ahead forecast from the last orange point, and the local level model only knows how to continue dynamics seen up until that point. So when the acceleration is positive, the one-step ahead is just where the last data point was, increased. When acceleration is negative, same. Or flat. But it can never guess that the data will reverse – it’s not in the models’ nature. So in this case where we have this spikey seasonal patter, the local level model will always be chasing the last value, and you’ll get this nice 1980’s 3D glasses effect.
I think your original plot was probably correct, but I won’t edit my post above because I still think working with dates is maddening and you should leave it to the program to do automatically for you.
Edit: A good sanity check if to plot the filtered values. These should line up perfectly. Back to the PyMC model:
fig, ax = plt.subplots(figsize=(14,4), dpi=144)
x_grid = dat.index
mu = post.filtered_posterior_observed.mean(dim=['chain', 'draw']).values
hdi = az.hdi(post.filtered_posterior_observed).filtered_posterior_observed
ax.plot(x_grid[1:], mu[1:], alpha=0.5, color='tab:blue')
ax.fill_between(x_grid[1:], *hdi.values.squeeze()[1:].T, alpha=0.25)
dat.log_y.iloc[1:].plot(ax=ax, alpha=0.5, color='tab:red')

