Hi,
At https://pymc3.readthedocs.io/en/latest/advanced_theano.html a method is proposed for predicting values for unseen data. My question is whether this method is still correct when applied to a model with data-dependent priors. As an example, I have adjusted the model shown in the documentation so that it includes a data-dependent prior:
import numpy as np
import theano
import pymc3 as pm

x = np.random.randn(100)
y = x > 0
x_shared = theano.shared(x)
y_shared = theano.shared(y)

with pm.Model() as model:
    # data-dependent prior: mu is a symbolic function of the shared data
    coeff = pm.Normal('x', mu=x_shared.min(), sd=1)
    logistic = pm.math.sigmoid(coeff * x_shared)
    pm.Bernoulli('obs', p=logistic, observed=y_shared)
    # fit the model
    trace = pm.sample()

# Switch out the observations and use `sample_ppc` to predict
x_shared.set_value([-1, 0, 1.])
y_shared.set_value([0, 0, 0])  # dummy values of the right length, as in the docs
with model:
    post_pred = pm.sample_ppc(trace, samples=500)
As I understand it, the posterior predictive samples generated this way are no longer correct, because the prior depends on x_shared: the trace was drawn under the prior implied by the original data, while the predictions are made after the shared variable has been replaced by the new points, which changes the prior. Am I missing anything here, or, if not, is there a way to generate posterior predictive samples for the new data correctly?
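The only workaround I could come up with is to evaluate the data-dependent hyperparameter once, as a plain number, before building the model, so that swapping x_shared later only changes the likelihood and not the prior. A rough sketch, continuing from the snippet above (mu_fixed and model_fixed are just my own names, not from the docs), though I'm not sure this captures the model I actually want:

# Workaround I considered: freeze the data-dependent hyperparameter as a constant
x_shared = theano.shared(x)
y_shared = theano.shared(y)
mu_fixed = x.min()  # evaluated once on the training data, stays fixed afterwards

with pm.Model() as model_fixed:
    coeff = pm.Normal('x', mu=mu_fixed, sd=1)  # prior no longer tracks x_shared
    logistic = pm.math.sigmoid(coeff * x_shared)
    pm.Bernoulli('obs', p=logistic, observed=y_shared)
    trace = pm.sample()

# swapping the data now only changes the likelihood term, not the prior
x_shared.set_value([-1, 0, 1.])
y_shared.set_value([0, 0, 0])
with model_fixed:
    post_pred = pm.sample_ppc(trace, samples=500)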
I’m quite new to the field, so I hope the question makes sense. Thanks in advance!