The best practice for out-of-sample prediction is to:

- Use `pm.MutableData` objects to hold your data
- Include `size=X.shape[0]` in the likelihood term
- Use `pm.set_data` together with `pm.sample_posterior_predictive` to get out-of-sample predictions
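Everything below assumes the usual imports plus a train/test split. Since your actual data isn't shown, here is a minimal, purely hypothetical setup so the snippets run end to end (the names `X`, `y_data_scaled`, `X_train`, etc. are just stand-ins matching what the plotting code expects):

```python
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm

# Hypothetical stand-in for your data: a noisy linear relationship.
# Swap in your own X / y_data_scaled here.
rng = np.random.default_rng(42)
X = np.linspace(0, 1, 100)
y_data_scaled = 0.5 + 2.0 * X + rng.normal(0, 0.2, size=X.size)
X_train, X_test = X[:70], X[70:]
y_train = y_data_scaled[:70]
```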
Here is how it looks with your model:
```python
with pm.Model() as model:
    # Mutable containers so the data can be swapped out later
    X_pt = pm.MutableData('X', X_train)
    y_pt = pm.MutableData('y', y_train)

    α = pm.Normal("α", mu=0, sigma=1)
    β = pm.Normal("β", mu=0, sigma=1)
    mu = α + β * X_pt
    sigma = pm.HalfCauchy("sigma", beta=1)

    # size=X_pt.shape[0] ties the shape of obs to X rather than to y.
    # Without it, the likelihood keeps y_train's shape, so swapping in
    # X_test later raises a shape mismatch error.
    obs = pm.Normal('obs', mu=mu, sigma=sigma, observed=y_pt, size=X_pt.shape[0])
```
```python
with model:
    trace = pm.sample(1000, tune=1000)

    # In-sample prediction
    trace = pm.sample_posterior_predictive(trace, extend_inferencedata=True)

    # Out-of-sample prediction
    pm.set_data({'X': X_test})
    # predictions=True stores the result in a new "predictions" group of the
    # InferenceData object, so your in-sample draws (saved in
    # "posterior_predictive") won't be overwritten.
    trace = pm.sample_posterior_predictive(trace, extend_inferencedata=True, predictions=True)
```
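To double-check that the in-sample draws survived, you can list the groups and compare shapes (a quick sanity check; the exact dimension sizes depend on your sampler settings):

```python
print(trace.groups())  # includes both 'posterior_predictive' and 'predictions'
print(trace.posterior_predictive.obs.shape)  # (chains, draws, len(X_train))
print(trace.predictions.obs.shape)           # (chains, draws, len(X_test))
```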
Plotting the results:
```python
fig, ax = plt.subplots()
for group, X_data in zip(['posterior_predictive', 'predictions'], [X_train, X_test]):
    # Posterior-predictive mean and HDI band for each group
    hdi = az.hdi(trace[group])
    idata = az.extract(trace, group)
    ax.plot(X_data, idata.obs.mean(dim='sample'), label=group)
    ax.fill_between(X_data, hdi.obs.sel(hdi='lower'), hdi.obs.sel(hdi='higher'), alpha=0.25)

# Observed data for reference
ax.plot(X, y_data_scaled, color='tab:red', ls='--')
ax.legend()
plt.show()
```
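If you need point predictions and interval bounds as plain arrays (say, to score against held-out targets), you can pull them out of the `predictions` group. A minimal sketch, assuming the variable name `obs` from the model above:

```python
# Posterior-predictive mean for each test point, as a NumPy array
y_pred = az.extract(trace, 'predictions').obs.mean(dim='sample').to_numpy()

# Bounds of the 94% HDI (ArviZ's default) for each test point
pred_hdi = az.hdi(trace['predictions'])
lower = pred_hdi.obs.sel(hdi='lower').to_numpy()
upper = pred_hdi.obs.sel(hdi='higher').to_numpy()
```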