I’m having trouble understanding what’s going on when pm.sample_ppc is called. I generated some data from a simple linear regression and then drew samples from the posterior predictive distribution with this code:
    with pm.Model() as model1:
        x = pm.Uniform('x', lower=0.0, upper=10.0, observed=observed_x, shape=n)
        sigma = pm.HalfCauchy('sigma', beta=1.0)
        beta = pm.Normal('beta', sd=5.0)
        alpha = pm.Normal('alpha', sd=5.0)
        y_hat = beta * x + alpha
        y = pm.Normal('y', mu=y_hat, sd=sigma, shape=n, observed=observed_y)
        trace = pm.sample()
        ppc = pm.sample_ppc(trace, samples=1, model=model1)
Then I plot the sampled Y values against the sampled X values:

    plt.scatter(ppc['x'], ppc['y'])

and I find that they are uncorrelated:
Now, I was expecting it to instead look like this:
In short, my question is this: why are the sampled X and Y values not consistent with regard to the model logic?
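To make the expectation concrete, here is a minimal NumPy sketch (deliberately not PyMC3 code) of what I thought a single posterior predictive draw would do: sample X from its Uniform(0, 10) distribution, then sample Y from a normal distribution centred on beta * X + alpha using those same X values. The function name `expected_ppc` and the parameter values are hypothetical placeholders standing in for one posterior draw; they are not taken from my actual trace.

```python
import numpy as np


def expected_ppc(alpha, beta, sigma, n, rng):
    """Forward-simulate one (x, y) dataset for a single parameter draw."""
    # Draw new x values from the same Uniform(0, 10) used in the model.
    x = rng.uniform(0.0, 10.0, size=n)
    # Draw y conditional on the x values that were just sampled.
    y = rng.normal(loc=beta * x + alpha, scale=sigma)
    return x, y


rng = np.random.default_rng(0)
# Hypothetical values standing in for one posterior draw of the parameters.
x_sim, y_sim = expected_ppc(alpha=1.0, beta=2.0, sigma=0.5, n=200, rng=rng)

# Because y is generated from the sampled x, the two are strongly correlated.
corr = np.corrcoef(x_sim, y_sim)[0, 1]
print(corr)
```

A scatter plot of `x_sim` against `y_sim` shows the linear trend I expected from `ppc['x']` and `ppc['y']`, which is why the uncorrelated output surprised me.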