PyMC3 posterior prediction example; how to incorporate a matrix of betas?

Thanks for the info; definitely will!

One last question if you might entertain me with is on predictions:

If I want to try to add some out-of-sample data, I am running into dimension issues:

# Predictions
new_data = np.array([0.25, 1.25, 0.33])

with sk_lm:
    pm.set_data({ "treatment_data": new_data[:, None] })
    predictions = pm.fast_sample_posterior_predictive(lm_idata)
    res = az.from_pymc3_predictions(predictions,
                              idata_orig=lm_idata,
                              inplace=False)

Error translating constant_data variable treatment_data: conflicting sizes for dimension 'obs_id': length 3 on the data but length 100 on coordinate 'obs_id'

1 Like

The issue now is that from_pymc3_predictions takes the coordinates from the model, which have not been updated and therefore still have the original obs_id length. You have to provide new coordinates, something like as.from_pymc3_predictions(..., coords={"obs_id": [1, 2, 3]})

Ah thanks; maybe that is another reason I should not be separating my betas for now.

With your fix, the model is still not happy!

with sk_lm:
    pm.set_data({ "treatment_data": new_data[:, None] })
    predictions = pm.fast_sample_posterior_predictive(lm_idata)
    res = az.from_pymc3_predictions(predictions,
                                    coords={"obs_id": [1, 2, 3]},
                                    idata_orig=lm_idata,
                                    inplace=False)

ValueError: conflicting sizes for dimension 'obs_id': length 100 on the data but length 3 on coordinate 'obs_id'

You have to make sure that all the shapes work together. Things that come to mind and would be issues:

  • You are creating pm.Data but not using it later in the model, so updating the data won’t change anything in the model, only what ArviZ stored as constant_data
  • You are only updating only treatment and not covariates, those are added, so their shapes need to match for the model to work.
1 Like

Thanks! That did it. The problem was:

# specify model
with pm.Model(coords=coords) as sk_lm:
    # data
    feature_data = pm.Data('feature_data', features, dims=('obs_id', 'features'))

    # priors
    alpha = pm.Normal('alpha', mu=0, sigma=1)
    betas = pm.Normal('betas', mu=[40, 90, 50], sigma=5, dims='features')

    # model error
    sigma = pm.Exponential("sigma", lam=1)

    # matrix-dot products
    m1 = pm.math.matrix_dot(feature_data, betas)  # here

I was using the literal features data instead of the pm.Data object.