Calling sample_ppc with a trace that does not contain all of the model’s RVs fails with “theano.gof.fg.MissingInputError: Undeclared input”.
Basically, I want to sample the posterior predictive distribution for a new dataset using a hierarchical model that I have already sampled from on a training dataset. So the top-level and group-level variables are present in the trace, but the latent variables for the individual units are not.
I am open to the argument that this is a misuse of sample_ppc, since this isn’t really a check so much as an attempt at inference. The functionality for doing this is essentially the same as in sample_ppc, though; just drawing random values in depth-first order seems to fix the problem.
If this seems likely to be of sufficiently general interest, I’d be happy to send a pull request for the fix that I have in mind.
Failing example:
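import pymc3 as pm

with pm.Model() as model:
    a = pm.Gamma('a', mu=10.0, sd=2.0)
    b = pm.Gamma('b', mu=a, sd=2.0)
    # Record only `a` (and its log-transformed counterpart); `b` is left out of the trace.
    trace = pm.sample(trace=[model.a, model.a_log__])
    assert len(trace.varnames) == 2
    # Variables added after sampling; `c` depends on the untraced `b`.
    c = pm.Gamma('c', mu=b, sd=1.0)
    d = pm.Normal('d', mu=c, sd=a)
    ppc = pm.sample_ppc(trace, 100, vars=[c, d])  # !!! raises theano.gof.fg.MissingInputError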
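
To illustrate the depth-first idea mentioned above, here is a minimal sketch in plain Python (standard library only). It is not PyMC3 code and not the proposed patch. NODES, draw, and sample_ppc_sketch are hypothetical names, and the graph is written out by hand to mirror a, b, c, d from the failing example. The point is only that a variable missing from the trace (here b) gets drawn from its distribution before anything that depends on it.

```python
import random

def gamma_mu_sd(rng, mu, sd):
    # Gamma parameterized by mean and standard deviation:
    # shape = mu^2 / sd^2, scale = sd^2 / mu.
    return rng.gammavariate(mu ** 2 / sd ** 2, sd ** 2 / mu)

# Hypothetical stand-in for the model graph: name -> (parent names, sampler).
NODES = {
    "a": ((), lambda rng: gamma_mu_sd(rng, 10.0, 2.0)),
    "b": (("a",), lambda rng, a: gamma_mu_sd(rng, a, 2.0)),
    "c": (("b",), lambda rng, b: gamma_mu_sd(rng, b, 1.0)),
    "d": (("c", "a"), lambda rng, c, a: rng.gauss(c, a)),
}

def draw(name, point, rng):
    """Return a value for `name`, drawing any missing ancestors first."""
    if name not in point:
        parents, sampler = NODES[name]
        # Depth-first: resolve every parent before sampling this node.
        parent_values = [draw(p, point, rng) for p in parents]
        point[name] = sampler(rng, *parent_values)
    return point[name]

def sample_ppc_sketch(trace, names, rng):
    """For each trace point, fill in the requested variables depth-first."""
    out = {n: [] for n in names}
    for point in trace:
        values = dict(point)  # copy so the trace itself is not modified
        for n in names:
            out[n].append(draw(n, values, rng))
    return out

rng = random.Random(0)
trace = [{"a": 9.5}, {"a": 10.2}, {"a": 11.0}]  # only `a` was recorded
ppc = sample_ppc_sketch(trace, ["c", "d"], rng)
```

Values already present in a trace point are reused as-is, so a is never redrawn; only b, c, and d are sampled, in that order, for each point.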