The problem with just inserting the known values into the GRW is that you get that “jumping” you see in the prior (and the posterior). If that large innovation at t=200 is acceptable to you, rock on. That’s definitely going to be the easiest way.
My first thought was a Potential as well. It would just need to be a quadratic loss, like `pm.Potential('knowledge_penalty', -a * (innovations[200] - midpoint) ** 2)`, where `a` is a strength parameter you can play around with.
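Here's a minimal sketch of that idea (the series length, the known index, `midpoint`, and the strength `a` are all placeholders). Note I apply the penalty to the level of the walk at the known timestep rather than to a single innovation, since the prior knowledge is about the level:

```python
import pymc as pm

T = 300
known_idx, midpoint = 200, 1.2  # hypothetical "known" point
a = 100.0                       # penalty strength -- play around with it

with pm.Model() as penalized_grw:
    sigma = pm.Exponential("sigma", 1)
    # length-T random walk
    x = pm.GaussianRandomWalk("x", sigma=sigma, init_dist=pm.Normal.dist(0, 1), steps=T - 1)
    # quadratic penalty pulling the walk toward `midpoint` at the known index
    pm.Potential("knowledge_penalty", -a * (x[known_idx] - midpoint) ** 2)
```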
A more principled approach would require some more structure than just a GRW. My first thought was a GP prior, conditioned on the values you have prior information about. Something like this:
```python
import numpy as np
import pymc as pm

time_idx = np.arange(100)[:, None]

with pm.Model() as m:
    # GP hyperparameters: eta is the amplitude, ell the lengthscale
    eta_alpha = pm.Exponential("eta_alpha", 1)
    ell_alpha = pm.Exponential("ell_alpha", 1)
    alpha_cov = eta_alpha**2 * pm.gp.cov.Exponential(1, ls=ell_alpha)

    alpha_gp = pm.gp.Latent(cov_func=alpha_cov)
    alpha_gp.prior("alpha_uncond", X=time_idx)

    # the "known" value at t = 50, with a little uncertainty around it
    a50 = pm.Normal("a50", mu=1.2, sigma=0.1, shape=(1,))
    alpha = alpha_gp.conditional(
        "alpha", Xnew=time_idx, given={"X": np.array([[50]]), "f": a50}
    )
```
1000 draws from `alpha`:
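In case it's useful, draws like these can be generated with something like:

```python
with m:
    prior = pm.sample_prior_predictive(draws=1000)
alpha_draws = prior.prior["alpha"]  # (chain, draw, time)
```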

I didn’t try sampling a posterior from it, but it will definitely struggle if the time series is long (exact latent GPs scale cubically in the number of time points).
Another option would be some kind of parametric function. You could ensure that a polynomial of order k > n will pass through n specified points by fixing n of the coefficients. That would perform poorly on forecasting (all polynomial models do), but in-sample it might be reasonable.
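Here's a rough sketch of that in plain numpy. Everything here (the degree, the known point, the random free coefficients) is a placeholder; in a real model the free coefficients would be parameters you put priors on:

```python
import numpy as np

k = 5                        # polynomial degree, so k + 1 coefficients
t_known = np.array([50.0])   # n = 1 "known" time point
y_known = np.array([1.2])    # and its value
n = len(t_known)

rng = np.random.default_rng(0)
c_free = rng.normal(size=k + 1 - n)  # stand-ins for model parameters

# Vandermonde matrix, split into free and constrained coefficient columns
V = np.vander(t_known, N=k + 1, increasing=True)
V_free, V_fixed = V[:, : k + 1 - n], V[:, k + 1 - n :]

# solve the n-by-n system so the constraints hold exactly
c_fixed = np.linalg.solve(V_fixed, y_known - V_free @ c_free)
coefs = np.concatenate([c_free, c_fixed])

t = np.arange(100.0)
trajectory = np.vander(t, N=k + 1, increasing=True) @ coefs
assert np.isclose(trajectory[50], 1.2)
```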
Finally, you could solve an optimization problem. You have a transition equation x_{t+1} = x_t + \epsilon_t for T = 100 timesteps. Say you “know” x_{50} = \bar x, and you sample the 99 i.i.d. epsilons (so they are also “known”). That leaves 99 unknowns x_0, x_1, \dots, x_{49}, x_{51}, \dots, x_{99} and 99 equations: a square system with a unique solution. You can run an optimizer to get the unique trajectory that satisfies x_{50} and the epsilons, and plug it into your model.
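Since the transition is linear, the “optimization” actually collapses to forward/backward substitution from the known point, so a sketch of it needs no solver at all (sizes and values below are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
eps = rng.normal(scale=0.1, size=T - 1)  # the sampled, now-"known" innovations
t_known, x_bar = 50, 1.2                 # the "known" x_50

x = np.empty(T)
x[t_known] = x_bar
for t in range(t_known, T - 1):   # substitute forward from t = 50
    x[t + 1] = x[t] + eps[t]
for t in range(t_known, 0, -1):   # and backward down to t = 0
    x[t - 1] = x[t] - eps[t - 1]
```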