I followed this documentation on updating priors for sequential data, e.g. time series.
That documentation only shows the training pipeline, but when I extended it to prediction using posterior predictive sampling I found a bug.
In the training part, I modified the model so that it needs an X of shape (3, 100) and trained with that input.
But in posterior predictive sampling I fed an X of shape (2, 100), and the model still produced predictions.
You can reproduce the error by running this notebook:
Weird, I’ll try to look into what is going on and get back to you.
I found the source of the problem.
In the fourth cell, where you write:

```python
basic_model = Model()
with basic_model:
    # Priors for unknown model parameters
    alpha = Normal('alpha', mu=0, sd=1)
    beta0 = Normal('beta0', mu=12, sd=1)
    beta1 = Normal('beta1', mu=18, sd=1)
    beta2 = Normal('beta2', mu=15, sd=1)

    # Expected value of outcome
    mu = alpha + beta0 * x_all[0] + beta1 * x_all[1] + beta2 * x_all[2]

    # Likelihood (sampling distribution) of observations
    Y_obs = Normal('Y_obs', mu=mu, sd=1, observed=Y)

    # draw 1000 posterior samples
    trace = sample(1000)
```
you are defining `mu` in terms of `x_all` and not of `x`. The difference seems subtle, but it turns out to have huge consequences. `x_all` is a `np.ndarray`, and when it is multiplied and summed with `theano.tensor`s, they are interpreted as `TensorConstant`s! This means that their values can never be changed later on (maybe something could be done with `theano.clone`), and the `mu` tensor will always be computed using the initial `x_all`. That is why, when you changed `x` and did

```python
ppc = pm.sample_ppc(trace, samples=50, model=basic_model)
```

`sample_ppc` could still sample even though the shapes were wrong.
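As a plain-Python analogy of what is going on (this is not actual theano code; `frozen` and `Shared` below just mimic a `TensorConstant` snapshot versus a `theano.shared` holder), a value captured by copy at definition time ignores later updates, while a mutable holder does not:

```python
import numpy as np

x_all = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# "TensorConstant" behaviour: a snapshot taken when the graph is built
frozen = x_all.copy()
mu_frozen = lambda a: a * frozen[0]

# "theano.shared" behaviour: a holder whose value can be swapped later
class Shared:
    def __init__(self, value):
        self.value = value

    def set_value(self, v):
        self.value = v

x = Shared(x_all)
mu_shared = lambda a: a * x.value[0]

# Later we "change x" to a differently shaped array
x.set_value(np.array([[10.0, 20.0], [30.0, 40.0]]))

print(mu_frozen(2.0))  # still the old first row: [2. 4.]
print(mu_shared(2.0))  # the new first row: [20. 40.]
```

The frozen version keeps sampling happily from stale data, which is exactly why the wrong shape went unnoticed.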
To solve this problem you should just change the first definition of `mu` to

```python
mu = alpha + beta0 * x[0] + beta1 * x[1] + beta2 * x[2]
```
As an unrelated side note, your last call to `sample_ppc` uses the `basic_model`, not the last updated `model`, and I’m not sure whether you did that on purpose. If you had used the updated `model`, you would not have had the problem. In the eighth cell, where you define the updated model as
```python
for _ in range(10):
    # generate more data
    X1 = np.random.randn(size)
    X2 = np.random.randn(size) * 0.2
    X3 = np.random.randn(size) * 0.3
    Y = alpha_true + beta0_true * X1 + beta1_true * X2 + beta2_true * X3 + np.random.randn(size)
    x_temp = np.array([X1, X2, X3])
    x.set_value(x_temp)

    model = Model()
    with model:
        # Priors are posteriors from previous iteration
        alpha = from_posterior('alpha', trace['alpha'])
        beta0 = from_posterior('beta0', trace['beta0'])
        beta1 = from_posterior('beta1', trace['beta1'])
        beta2 = from_posterior('beta2', trace['beta2'])

        # Expected value of outcome
        mu = alpha + beta0 * x[0] + beta1 * x[1] + beta2 * x[2]

        # Likelihood (sampling distribution) of observations
        Y_obs = Normal('Y_obs', mu=mu, sd=1, observed=Y)

        # draw 1000 posterior samples
        trace = sample(1000)
    traces.append(trace)
```
you are setting `mu = alpha + beta0 * x[0] + beta1 * x[1] + beta2 * x[2]` using the shared `x`, and not `x_temp` nor `x_all`, so your later traces use correctly updated `x`’s, and `sample_ppc` would complain about the inconsistent shape of a later `x`.
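The shape complaint can be sketched in plain numpy (the `mu` function below is an illustrative stand-in for the model expression, not PyMC3 code): an expression that genuinely reads rows 0, 1, and 2 of its input fails as soon as the input has only two rows, whereas the frozen constant never would.

```python
import numpy as np

def mu(alpha, b0, b1, b2, x):
    # references three predictor rows, so x needs shape (3, n)
    return alpha + b0 * x[0] + b1 * x[1] + b2 * x[2]

x_train = np.random.randn(3, 100)
print(mu(0.0, 1.0, 1.0, 1.0, x_train).shape)  # (100,)

x_bad = np.random.randn(2, 100)  # only two predictors
try:
    mu(0.0, 1.0, 1.0, 1.0, x_bad)
except IndexError as e:
    print("shape mismatch detected:", e)
```

With the shared `x` wired in correctly, this is the kind of error you would have seen immediately instead of silently sampling from stale data.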
Thanks a lot, @lucianopaz. Actually, that was a silly mistake, and this was just a prototype of the code I was working on. In my real code, I had taken values from a theano tensor instead of a `numpy.ndarray`. But your suggestion helped me find the real bug: I had to reshape an array of shape (21,) to (21, 1), and that solved the problem.
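For reference, that reshape is a one-liner in numpy (the variable names here are illustrative, not from my actual code):

```python
import numpy as np

y = np.arange(21)        # flat vector, shape (21,)
col = y.reshape(21, 1)   # explicit column, shape (21, 1)
# equivalent forms: y.reshape(-1, 1) or y[:, None]

print(y.shape, col.shape)
```

The (21, 1) column broadcasts differently from the flat (21,) vector, which is what was tripping up the shapes downstream.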
Also, I used the `basic_model` because in prediction I had to treat indices 0, 1, and 2 as constants, while the new `model` had a variable inside the for loop and dealing with that was tedious. I used the updated trace samples in the basic model, so it is effectively the updated one.
Again, thanks for the help!