At https://pymc3.readthedocs.io/en/latest/advanced_theano.html a method is proposed for predicting values for unseen data. My question is whether this method is still correct when applied to a model with a data-dependent prior. As an example, I adjust the model shown in the documentation so that it includes a data-dependent prior:
```python
import numpy as np
import pymc3 as pm
import theano

x = np.random.randn(100)
y = x > 0
x_shared = theano.shared(x)

with pm.Model() as model:
    # data-dependent prior: the prior mean is the minimum of the data
    coeff = pm.Normal('x', mu=x_shared.min(), sd=1)
    logistic = pm.math.sigmoid(coeff * x_shared)
    pm.Bernoulli('obs', p=logistic, observed=y)

    # fit the model
    trace = pm.sample()

    # Switch out the observations and use `sample_ppc` to predict
    x_shared.set_value([-1, 0, 1.])
    post_pred = pm.sample_ppc(trace, samples=500)
```
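To make the concern concrete, here is a minimal NumPy-only sketch (the seed is arbitrary, chosen just for illustration) showing that the prior mean `x_shared.min()` really does change when the shared data is swapped out:

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed for illustration
x = rng.standard_normal(100)          # stand-in for the original data

mu_at_fit_time = x.min()              # prior mean `coeff` was sampled under
mu_after_swap = min([-1, 0, 1.])      # prior mean implied by the new data

# With 100 standard-normal draws the minimum is almost surely well below -1,
# so the prior mean used during sampling differs from the one after the swap.
print(mu_at_fit_time, mu_after_swap)
```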
My understanding is that the posterior predictive samples generated this way are no longer correct, because the trace was drawn under the prior implied by the original data, and setting the new data points changes that prior. Am I missing something here, or, if not, is there a way to generate posterior predictive samples for the new data correctly?
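One way to see the mismatch quantitatively is to evaluate the same coefficient samples under the Normal(mu, 1) prior before and after the swap. This is a sketch with made-up stand-in values, not numbers taken from an actual trace:

```python
import numpy as np

def normal_logpdf(x, mu, sd=1.0):
    # log-density of a Normal(mu, sd) distribution
    return -0.5 * ((x - mu) / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))

coeff_samples = np.array([-2.3, -1.9, -2.1])  # hypothetical stand-in for trace['x']
mu_old, mu_new = -2.5, -1.0                   # hypothetical x_shared.min() before/after

logp_old = normal_logpdf(coeff_samples, mu_old).sum()
logp_new = normal_logpdf(coeff_samples, mu_new).sum()

# The samples have a different prior density under the swapped-in data,
# so the posterior the trace was drawn from is not the model's new posterior.
print(logp_old, logp_new)
```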
I’m quite new to the field, so I hope the question makes sense. Thanks in advance!