Get out-of-sample posterior predictive for mean

flyaflya · August 28, 2023, 6:19pm

I am playing around with using BART to predict horse speed (y_pred) using some old horse racing data. Here is my model:

import pymc as pm
import pymc_bart as pmb

with pm.Model() as model_jockey:
    # Mutable container so the predictors can be swapped for test data later
    X = pm.MutableData("X", x_train)
    Y = y_data
    mu = pmb.BART("mu", X=X, Y=Y, m=10)  # up this to at least ?100? for production
    sigma = pm.HalfNormal("sigma", sigma=0.25)
    # shape follows X so y_pred resizes when new data is set
    y_pred = pm.StudentT("y_pred", mu=mu, sigma=sigma, nu=2, observed=Y, shape=X.shape[0])
    idata_jockey = pm.sample(random_seed=RANDOM_SEED)  # , initvals=initial_values)

I am able to predict y_pred for out-of-sample data using this code:

with model_jockey:
    pm.set_data({"X": x_test})
    ppc2 = pm.sample_posterior_predictive(
        trace=idata_jockey, random_seed=RANDOM_SEED,
        extend_inferencedata=True, predictions = True
    )

But for the life of me, I cannot figure out how to get out-of-sample predictions for "mu". How can I get the "mu" values that are used to sample from the posterior predictive when generating the y_pred values in the PPC?

cluhmann · August 28, 2023, 6:27pm

Welcome!

Would this work (it might not, I’m doing this from memory)?

ppc2 = pm.sample_posterior_predictive(
    trace=idata_jockey, random_seed=RANDOM_SEED,
    extend_inferencedata=True, predictions=True,
    var_names=["mu"]
)

flyaflya · August 28, 2023, 6:35pm

Almost. I got an error, I think because of extend_inferencedata. This works:

with model_jockey:
    pm.set_data({"X": x_test})
    ppc2 = pm.sample_posterior_predictive(
        trace=idata_jockey, var_names = ["mu"],
        random_seed=RANDOM_SEED,
        predictions = True
    )

I started reading this excellent documentation here (should have found it earlier): Out of model predictions with PyMC - PyMC Labs

Thanks for the help!!
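
For anyone landing here later, a minimal sketch of how the out-of-sample "mu" draws might be summarized once they have been sampled. In the thread's workflow the draws would presumably be read from `ppc2.predictions["mu"]`; the array below is a synthetic stand-in with an assumed (chain, draw, test row) layout, and the names `mu_draws` and `pooled` are illustrative only, not part of PyMC.

```python
import numpy as np

# Synthetic stand-in for the draws of mu (assumption: in the real workflow
# this would come from ppc2.predictions["mu"].values, laid out as
# chain x draw x test row).
rng = np.random.default_rng(0)
n_chains, n_draws, n_test = 4, 250, 3
mu_draws = rng.normal(
    loc=[10.0, 12.0, 15.0], scale=0.5, size=(n_chains, n_draws, n_test)
)

# Pool chains and draws so each column holds every draw for one test row.
pooled = mu_draws.reshape(-1, n_test)

# Point estimate and an equal-tailed 94% interval per test row
# (a quantile interval, not an HDI).
mu_mean = pooled.mean(axis=0)
mu_lower, mu_upper = np.quantile(pooled, [0.03, 0.97], axis=0)

for i in range(n_test):
    print(
        f"row {i}: mean={mu_mean[i]:.2f}, "
        f"94% interval=({mu_lower[i]:.2f}, {mu_upper[i]:.2f})"
    )
```

The interval here describes uncertainty in the mean itself; intervals computed from y_pred draws would be wider, since they also include the StudentT noise.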
