Hi! Could you please help me with this problem?
I’m currently using pmb.predict() to make predictions, but it’s much slower than pm.sample_posterior_predictive(). Is there a reason why?
So far I only know how to use pm.sample_posterior_predictive() for in-sample prediction; I don’t know how to use it for out-of-sample prediction. Is there a way to do that? Or is there another way to make predictions faster?
The following code is my attempt to use pm.sample_posterior_predictive() on out-of-sample data, but it gives an unexpected result:
with pm.Model() as model_2:
    pred = pm.MutableData("pred", X_train)
    σ = pm.HalfNormal("σ", y_train.std())
    μ = pmb.BART("μ", pred.get_value(), y_train, m=50)
    y = pm.Normal("y", μ, σ, observed=y_train)
    idata_2 = pm.sample(random_seed=RANDOM_SEED)

with model_2:
    pm.set_data({"pred": X_test})
    idata_2 = pm.sample_posterior_predictive(
        idata_2,
        var_names=["μ", "σ"],
        return_inferencedata=True,
        predictions=True,
        extend_inferencedata=True,
        random_seed=rng,
    )
My training set has 3750 data points and my test set has 1250, so I expected predictions for 1250 points, but the dimension of the predictions I get back does not match the test set.
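My current guess, which I have not confirmed, is that `pred.get_value()` is involved: it returns a plain NumPy copy of `X_train`, so `μ` may never see the later `pm.set_data` call. The sketch below is only a pure-Python analogy of that suspicion, not PyMC code. `SharedData`, `count_rows_from_snapshot`, and `count_rows_from_container` are hypothetical names I made up for illustration, and the data values are placeholders.

```python
class SharedData:
    """Hypothetical stand-in for a mutable data container (not the PyMC class)."""

    def __init__(self, value):
        self._value = list(value)

    def get_value(self):
        # Return a copy, so the caller holds a snapshot, not the container.
        return list(self._value)

    def set_value(self, value):
        self._value = list(value)


def count_rows_from_snapshot(snapshot):
    # Built from a fixed copy: the row count is frozen at construction time.
    return lambda: len(snapshot)


def count_rows_from_container(container):
    # Built from the container itself: rows are looked up again on each call.
    return lambda: len(container.get_value())


# Same sizes as my data; the values themselves are placeholders.
X_train = [[0.0]] * 3750
X_test = [[0.0]] * 1250

pred = SharedData(X_train)
frozen = count_rows_from_snapshot(pred.get_value())  # like passing pred.get_value()
live = count_rows_from_container(pred)               # like passing pred itself

pred.set_value(X_test)  # loosely analogous to pm.set_data({"pred": X_test})

print(frozen())  # 3750
print(live())    # 1250
```

If this is really what is happening in my model, then passing `pred` itself to `pmb.BART` (and possibly giving `y` a shape tied to `μ`, e.g. `shape=μ.shape`) might be what I need, but I have not verified that.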