Question generating posterior predictive samples with Latent GP

Hello all! I have a question about generating posterior predictive samples for out of sample values using a Latent GP and Binomial likelihood.

The data is like:
x = time points
z = number of trials of the Binomial at time x
y = number of successes at time x.

I split my data into train and test sets then:

My model is:

with pm.Model() as model:
    model.add_coord("obs_id", z_train.flatten(), mutable=True)

    ## Data containers to enable prediction
    trials = pm.MutableData("trials", z_train.flatten(), dims="obs_id")
    successes = pm.MutableData("successes", y_train.flatten(), dims="obs_id")
    t = pm.MutableData("t", x_train)

    # Seasonal component.
    ls = pm.Gamma(name='ls_1', alpha=2.0, beta=1.0)
    period = pm.Gamma(name='period', alpha=52.1429, beta=2)
    gp_seasonal =, period=period, ls=ls))

    # Linear trend.
    c_3 = pm.Normal(name='c_3', mu=1, sigma=2)
    gp_linear =, c=c_3))

    # Define gaussian process.
    gp = gp_linear + gp_seasonal
    logit_prob = gp.prior("logit_prob", X=t)

    # Likelihood.
    likelihood = pm.Binomial("likelihood", logit_p=logit_prob, n=trials, observed=successes)

Then we sample:

with model:
    # Sample.
    trace = pm.sample(draws=500, chains=2, tune=500, cores=1) 

I have to do cores=1 because I am on a Mac M1.

Now I want to do posterior predictive sampling on the test data. (and for the training data to check in-sample fit).

I tried:

with model:
            "t": x_train,
            "trials": z_train.flatten(),
            "successes": y_train.flatten()
        coords={"obs_id": z_train.flatten()}
    y_train_pred_samples = pm.sample_posterior_predictive(

            "t": x_test,
            "trials": z_test.flatten(),
        coords={"obs_id": z_test.flatten()}
    y_test_pred_samples = pm.sample_posterior_predictive(

but I get an error because the length of y_train is the old obs_id and not the length of y_test.

What is the correct format to do OOS posterior predictions for this type of model?


If I am not wrong you need to expand/recreate your model with .conditional as in Gaussian Processes: Latent Variable Implementation — PyMC example gallery unless you use the HSGP approximation, in which case the set_data approach suffices.


Thanks! That example very closely aligns to what I need.