How to make predictions in production without "observed eaches"

Jordan_Howell · September 25, 2023, 2:14pm

Hello,

I’ve developed a model I’d like to put into production. However, I realize that once the model is in production, making OOS predictions fails because there is no observed data. It gives a shape mismatch error because of the training observed data used in the initial sampling.

How do I make predictions, without observed data, in production?

cluhmann · September 25, 2023, 5:49pm

Can you provide a minimal example illustrating how you are doing prediction and what error you are seeing?

Jordan_Howell · September 25, 2023, 6:08pm

Ah…of course. My apologies.

with pm.Model(coords=coords) as constant_model:
    # Data that does not change
    cat_to_bl_map = pm.Data('cat_to_bl_map', cat_to_bl_idx, mutable=False)
    subcat_to_cat_map = pm.Data('subcat_to_cat_map', subcat_to_cat_idx, mutable=False)
    ic_to_subcat_map = pm.Data('ic_to_subcat_map', ic_to_subcat_idx, mutable=False)
    ic_to_item_map = pm.Data('ic_to_item_map', ic_to_item_idx, mutable=False)

    # Data that does change
    pm_loc_idx = pm.Data('loc_idx', location_idx, mutable=True)
    pm_item_idx = pm.Data('item_idx', item_idx, mutable=True)
    pm_time_idx = pm.Data('time_idx', time_idx, mutable=True)
    observed_eaches = pm.Data('observed_eaches', df.residuals, mutable=True)
    promo_ = pm.Data('promotion', promo_idx, mutable=True)

    loc_intercept = pm.Normal('loc_intercept', mu=0, sigma=1, dims=['location'])
    loc_bl = utility_functions.make_next_level_hierarchy_variable(name='loc_bl', mu=loc_intercept, alpha=2, beta=1, dims=['business_line', 'location'])
    loc_cat = utility_functions.make_next_level_hierarchy_variable(name='loc_cat', mu=loc_bl[cat_to_bl_map], alpha=2, beta=1, dims=['category', 'location'])
    loc_subcat = utility_functions.make_next_level_hierarchy_variable(name='loc_subcat', mu=loc_cat[subcat_to_cat_map], alpha=2, beta=1, dims=['subcategory', 'location'])
    loc_ic = utility_functions.make_next_level_hierarchy_variable(name='loc_ic', mu=loc_subcat[ic_to_subcat_map], alpha=2, beta=1, dims=['ic', 'location'])
    loc_item = utility_functions.make_next_level_hierarchy_variable(name='loc_item', mu=loc_ic[ic_to_item_map], alpha=2, beta=1, dims=['item', 'location'])

    promo_intercept = pm.Normal('promo_intercept', mu=0, sigma=1)
    promo_bl = utility_functions.make_next_level_hierarchy_variable(name='promo_bl', mu=promo_intercept, alpha=2, beta=1, dims=['business_line'])
    promo_cat = utility_functions.make_next_level_hierarchy_variable(name='promo_cat', mu=promo_bl[cat_to_bl_map], alpha=2, beta=1, dims=['category'])
    promo_subcat = utility_functions.make_next_level_hierarchy_variable(name='promo_subcat', mu=promo_cat[subcat_to_cat_map], alpha=2, beta=1, dims=['subcategory'])
    promo_ic = utility_functions.make_next_level_hierarchy_variable(name='promo_ic', mu=promo_subcat[ic_to_subcat_map], alpha=2, beta=1, dims=['ic'])
    promo_item = utility_functions.make_next_level_hierarchy_variable(name='promo_item', mu=promo_ic[ic_to_item_map], alpha=2, beta=1, dims=['item'])

    mu = (loc_item[pm_item_idx, pm_loc_idx] + promo_item[pm_item_idx])

    sigma = pm.HalfNormal('sigma', sigma=100)

    eaches = pm.Normal('predicted_eaches',
                       mu=mu,
                       sigma=sigma,
                       observed=observed_eaches)

Here is how I’m predicting and the error:

with constant_model:
    pm.set_data({
                'item_idx': test_item_idx,
                'loc_idx': test_location_idx,
                'promotion': test_promo_idx,
                'time_idx': test_time_idx,
                })
    test_ppc = pm.sample_posterior_predictive(trace = idata)

You’ll notice I don’t have observed_eaches inside the set_data call. Those values don’t exist in production, and sampling throws this error:

ValueError: size does not match the broadcast shape of the parameters. (12516,), (12516,), (27959,)

Size (27959,) comes from the original observed_eaches.

Should I create an empty column of observed_eaches and feed that into set_data, or is there a best-in-class way to do this in production?

cluhmann · September 25, 2023, 8:36pm

The issue isn’t that there is no data. These sorts of issues arise when there is a mismatch between the data you have supplied via pm.set_data() and the rest of the model. Here, it seems like the shape of your observed data observed_eaches and your now-modified indices are incompatible. Indeed, I’m not exactly sure what you are trying to do. Can we see what is in coords?
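
As a rough illustration of this kind of mismatch, here is a NumPy-only sketch. It is an analogy rather than what PyMC does internally: the parameters have the new length (12516) while the requested output size is still the training length (27959), so the draw cannot be broadcast. The variable names are made up for the example; only the two lengths come from the error message above.

```python
import numpy as np

rng = np.random.default_rng(0)

n_test = 12516   # length of the new index arrays passed to set_data
n_train = 27959  # length of the original observed_eaches

# mu is rebuilt from the new indices, so it has the test length.
mu = np.zeros(n_test)

# Asking for draws of the training length with test-length parameters fails.
mismatch_raised = False
try:
    rng.normal(loc=mu, scale=1.0, size=n_train)
except ValueError:
    mismatch_raised = True

# When the requested size agrees with the parameters, the draw succeeds.
ok_draws = rng.normal(loc=mu, scale=1.0, size=n_test)
```

The fix therefore has to make the size of the observed variable follow the new data, which is what the replies below address.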

jessegrabowski · September 26, 2023, 8:06am

If you’re just doing out-of-sample prediction, you can pass an array of zeros to observed_eaches. The data won’t be used in the computation, but it will force the shapes to match.
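
A minimal sketch of that zeros workaround, assuming the data container names from the model above. The helper `build_prediction_data` is hypothetical (it is not part of PyMC); it only assembles the dictionary that would be handed to pm.set_data, with a zeros placeholder whose length matches the new indices.

```python
import numpy as np

def build_prediction_data(item_idx, loc_idx, promo_idx, time_idx):
    """Hypothetical helper: build the dict to pass to pm.set_data for prediction."""
    arrays = {
        'item_idx': np.asarray(item_idx),
        'loc_idx': np.asarray(loc_idx),
        'promotion': np.asarray(promo_idx),
        'time_idx': np.asarray(time_idx),
    }
    n_rows = arrays['item_idx'].shape[0]
    # All row-level inputs must describe the same set of rows.
    for name, arr in arrays.items():
        if arr.shape[0] != n_rows:
            raise ValueError(f"{name} has {arr.shape[0]} rows, expected {n_rows}")
    # Placeholder only: the values are not used when sampling predictions,
    # but the length keeps the observed variable consistent with the inputs.
    arrays['observed_eaches'] = np.zeros(n_rows)
    return arrays

new_data = build_prediction_data([0, 1, 2], [1, 1, 0], [0, 1, 0], [5, 5, 6])
# Inside the model context this would be used as: pm.set_data(new_data)
```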

The recommended way to handle this is to pass a shape argument to the observed RV, so that it automatically re-sizes to match the exogenous data. See here for a discussion. So you could do:

    eaches = pm.Normal('predicted_eaches',
                            mu=mu,
                            sigma=sigma,
                            observed=observed_eaches,
                            shape=pm_loc_idx.shape[0])

Any exogenous data container that you will be updating for out-of-sample prediction will do; I picked pm_loc_idx at random.

Jordan_Howell · September 26, 2023, 9:48am

Perfect. Thanks. This will be the first Bayesian model I put into production. Thanks for all the help over the last year.
