Question regarding `gp.conditional` and `gp.prior`

I think I have a pretty basic question: When can I expect gp.conditional and gp.prior to agree when I run them both on the same data?

For example, the following model:

with pm.Model() as gp_model:
    # data containers
    X = pm.Data("X", item_times[:, None], shape=(None, 1))
    trials = pm.Data("trials", item_trials)
    sends = pm.Data("sends", item_sends)

    # priors
    ell1 = pm.HalfNormal("ell1", sigma=1.25)
    ell2 = pm.HalfNormal("ell2", sigma=1.25)
    eta1 = pm.HalfNormal("eta1", sigma=1.0)
    eta2 = pm.HalfNormal("eta2", sigma=1.0)

    # define the kernel
    cov = eta1 * pm.gp.cov.Matern12(1, ell1) + eta2 * pm.gp.cov.Periodic(1, 12, ls=ell2)

    gp = pm.gp.Latent(cov_func=cov)
    f = gp.prior("f", X=X)

    # logit link and Binomial likelihood
    lik = pm.Binomial("lik", n=trials, logit_p=f, observed=sends)

is sampled with nutpie:

import nutpie

compiled_gp_model = nutpie.compile_pymc_model(gp_model)
idata = nutpie.sample(
    compiled_gp_model, chains=2
)

and then we extend it with

with gp_model:
    f_pred = gp.conditional("f_pred", item_times[:, None])
    idata.extend(
        pm.sample_posterior_predictive(
            idata,
            var_names=["f_pred"],
        )
    )

and I would (naively) expect idata.posterior.f and idata.posterior_predictive.f_pred to "agree" in some sense, but they are wildly different.

For some example data this is the plot of both:

Am I misunderstanding what is happening?

I think this is because you fit the model with compiled_gp_model and then are generating conditionals with gp_model, which would explain why the conditionals look like they are coming from a prior. Try fitting the model with:

with gp_model:
    idata = pm.sample(chains=2, nuts_sampler="nutpie")

And see if things look any better.


Yup! That’s it. Using gp_model in the loop gives:

Huh. I guess I cannot use a compiled model then?

and THANK YOU

You can use nutpie for posterior sampling and then use the trace with the standard PyMC model for sample_posterior_predictive, but perhaps nutpie is not returning all the info that sample_posterior_predictive needs (constant/observed data?).

Or you’re not updating the data on the regular pymc model?

So by “updating the data on the regular pymc model” do you mean supplying more arguments to sample_posterior_predictive?

Or by doing a pm.set_data() operation for all of the inputs?

I will try to replicate a simple example to see if I can get it to work with the compiled nutpie model.

I meant set_data, but it doesn't seem to be involved in your case.

Whether or not the nutpie model is compiled is irrelevant; it's just used to get the posterior trace. I suspect you'd see the same problem if you used the sample interface with sample(nuts_sampler="nutpie")?

This can happen because nutpie is not returning the same InferenceData that the PyMC default sampler does — it's probably missing something in constant_data or observed_data?