It depends very much on how your particular model was defined, and in particular on whether the traced parameters get their shape by broadcasting against the original training data (if they do, you will run into this issue).
I’ll give you an example in which you can use the “old trace” to predict out-of-sample observations without having to use shared variables anywhere:
import numpy as np
import pymc3 as pm
from theano import tensor as tt
from matplotlib import pyplot as plt

def model_factory(x, y):
    x = np.asarray(x)
    y = np.asarray(y)
    with pm.Model() as model:
        beta = pm.Normal('beta', mu=0, sigma=1, shape=(2,))
        x_ = x.flatten()
        # Design matrix with an intercept column
        X = np.column_stack((np.ones_like(x_), x_))
        # If you enclose mu in a Deterministic it will fail, as in issue #3346
        mu = tt.tensordot(X, beta, axes=[1, 0])
        pm.Normal('obs', mu=mu, sigma=1., observed=y)
    return model
# True parameters: intercept 1.4, slope -0.6
BETA = np.array([1.4, -0.6])
train_x = np.random.randn(100) * 3
# np.polyval expects coefficients from highest degree to lowest, hence the [::-1]
train_y = np.polyval(BETA[::-1], train_x)
hold_out_x = np.linspace(-10, 10, 51)
hold_out_y = np.polyval(BETA[::-1], hold_out_x)
# Perform the training to get the trace
with model_factory(train_x, train_y) as model:
    train_trace = pm.sample()
    pm.traceplot(train_trace)
    pm.plot_posterior(train_trace)
    ppc = pm.sample_posterior_predictive(train_trace)
    plt.figure()
    plt.plot(train_x, ppc['obs'].T, '.b', alpha=0.01)
    plt.plot(train_x, train_y, '-or')
    proxy_arts = [plt.Line2D([0], [0], marker='.', linestyle='', color='b'),
                  plt.Line2D([0], [0], marker='o', linestyle='-', color='r')]
    plt.legend(handles=proxy_arts, labels=['prediction', 'training data'])
    plt.title('Posterior predictive on the training set')
# Construct a new model with the held out data
with model_factory(hold_out_x, hold_out_y) as test_model:
    # Reuse the trace from the training model to draw posterior predictive samples
    ppc = pm.sample_posterior_predictive(train_trace)
    plt.figure()
    plt.plot(hold_out_x, ppc['obs'].T, '.b', alpha=0.01)
    plt.plot(hold_out_x, hold_out_y, '-or')
    proxy_arts = [plt.Line2D([0], [0], marker='.', linestyle='', color='b'),
                  plt.Line2D([0], [0], marker='o', linestyle='-', color='r')]
    plt.legend(handles=proxy_arts, labels=['prediction', 'hold out data'])
    plt.title('Posterior predictive on the test set')
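Note that pm.sample_posterior_predictive picks the model up from the enclosing with block here; if I remember correctly, you can equivalently pass it explicitly via the model keyword argument.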
The trace plot looks like: [traceplot image]
The posterior plot looks like: [plot_posterior image]
The posterior predictive plot on the training data looks like: [training set PPC image]
Finally, the posterior predictive plot on the held out data looks like: [held out PPC image]
The only reason this is possible is that none of the traced parameters (the variables that get sampled with pm.sample) have their shape determined by the training dataset. If, for instance, you enclose mu in a Deterministic as mu = pm.Deterministic('mu', tt.tensordot(X, beta, axes=[1, 0])), then mu.shape will depend on the original X's shape (regardless of whether X is a shared tensor or not). This leads the trace to hold mu values as a numpy array with a shape (and also values!) given by the training data, so when you try to sample the posterior predictive later on, you get a shape mismatch error.
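For concreteness, here is a minimal sketch of the failing variant (broken_model_factory is just a hypothetical name; it is the same model_factory as above with mu wrapped in a Deterministic):

def broken_model_factory(x, y):  # hypothetical name, for illustration only
    x = np.asarray(x)
    y = np.asarray(y)
    with pm.Model() as model:
        beta = pm.Normal('beta', mu=0, sigma=1, shape=(2,))
        x_ = x.flatten()
        X = np.column_stack((np.ones_like(x_), x_))
        # mu is now traced, and its shape is tied to the number of rows in X
        mu = pm.Deterministic('mu', tt.tensordot(X, beta, axes=[1, 0]))
        pm.Normal('obs', mu=mu, sigma=1., observed=y)
    return model

with broken_model_factory(train_x, train_y):
    broken_trace = pm.sample()
    # broken_trace['mu'] now has the training data's length (100) baked in

with broken_model_factory(hold_out_x, hold_out_y):
    # Fails with a shape mismatch: the model expects mu of length 51,
    # but the trace holds mu values of length 100
    pm.sample_posterior_predictive(broken_trace)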
If your model defines no parameters whose shape depends on the training data (except, of course, the observed variable), then you should be able to use the old train_trace as input to sample_posterior_predictive of the model that was built with the held out data.
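If you are unsure whether your model satisfies this condition, a quick sanity check is to look at what the trace actually stores; a sketch using the example above:

# 'beta' is the only traced variable, and its shape (2,) does not depend
# on the size of the training data, so the trace can be reused
print(train_trace.varnames)       # ['beta']
print(train_trace['beta'].shape)  # e.g. (2000, 2): (total draws, parameter dims)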