Getting an input error when setting data to out of sample data

Hello,

I’m trying to set my data to an out of sample set and get the following error:

Error

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/aesara/link/basic.py in __set__(self, value)
    111                 self.storage[0] = self.type.filter_inplace(
--> 112                     value, self.storage[0], **kwargs
    113                 )

/opt/conda/lib/python3.7/site-packages/aesara/graph/type.py in filter_inplace(self, value, storage, strict, allow_downcast)
    129         """
--> 130         raise NotImplementedError()
    131 

NotImplementedError: 

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_34196/1650750725.py in <module>
     46                     'pvbv': test_promo_pvbv_idx,
     47                     'giftset': test_giftset_idx,
---> 48                     'month': test_month_idx})
     49         print("sampling test ppc...")
     50         test_ppc = pm.sample_posterior_predictive(idata)

/opt/conda/lib/python3.7/site-packages/pymc/model.py in set_data(new_data, model, coords)
   1877 
   1878     for variable_name, new_value in new_data.items():
-> 1879         model.set_data(variable_name, new_value, coords=coords)
   1880 
   1881 

/opt/conda/lib/python3.7/site-packages/pymc/model.py in set_data(self, name, values, coords)
   1319                 self._coords[dname] = tuple(new_coords)
   1320 
-> 1321         shared_object.set_value(values)
   1322 
   1323     def register_rv(

/opt/conda/lib/python3.7/site-packages/aesara/compile/sharedvalue.py in set_value(self, new_value, borrow)
    143             self.container.value = new_value
    144         else:
--> 145             self.container.value = copy.deepcopy(new_value)
    146 
    147     def get_test_value(self):

/opt/conda/lib/python3.7/site-packages/aesara/link/basic.py in __set__(self, value)
    113                 )
    114             except NotImplementedError:
--> 115                 self.storage[0] = self.type.filter(value, **kwargs)
    116 
    117         except Exception as e:

/opt/conda/lib/python3.7/site-packages/aesara/tensor/type.py in filter(self, data, strict, allow_downcast)
    187                             f'"function". Value: "{repr(data)}"'
    188                         )
--> 189                         raise TypeError(err_msg)
    190                 elif (
    191                     allow_downcast is None

TypeError: ('TensorType(int32, (None,)) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to int32, or 2) set "allow_input_downcast=True" when calling "function". Value: "array([1.        , 1.02083333, 1.04166667, 1.0625    , 1.08333333,\n       1.10416667, 1.125     , 1.14583333, 1.16666667, 1.1875    ,\n       1.20833333, 1.22916667])"', 'Container name "t"')

Code:

    t_test = np.arange(1, 1+12/T, 1/T)
    print(t_test)
    #get test data
    

    #bring in new data
    with constant_model:
        test_time_idx, test_times = pd.factorize(df_test.index.get_level_values(0))
        test_month_idx, test_month = pd.factorize(df_test['month'])
        test_item_idx, test_items =  pd.factorize(df_test.index.get_level_values(1))
        test_location_idx, test_locations = pd.factorize(df_test.index.get_level_values(2))
        test_promo_idx, test_promo = pd.factorize(df_test['promo_status_metric_measure'])
        test_cann_idx, test_cannibalization = pd.factorize(df_test['cannibalized'])
        test_dc_idx, test_dc_discount = pd.factorize(df_test['promo_desc_dcdiscount'])
        test_free_fin_idx, test_free_fin = pd.factorize(df_test['promo_desc_freefinancing'])
        test_giftset_idx, test_giftset = pd.factorize(df_test['promo_desc_giftset'])
        test_promo_pvbv_idx, test_promo_pvbv = pd.factorize(df_test['promo_desc_pvbv'])
        pm.set_data({'loc_idx': test_location_idx,
                    'item_idx': test_item_idx,
                    'time_idx': test_time_idx,
                    'observed_eaches': df_test['residual'],
                    't': t_test,
                    'promotion': test_promo_idx,
                    'cannibalization': test_cann_idx,
                    'dc_discount':test_dc_idx,
                    'free_fin': test_free_fin_idx,
                    'pvbv': test_promo_pvbv_idx,
                    'giftset': test_giftset_idx,
                    'month': test_month_idx})
        print("sampling test ppc...")
        test_ppc = pm.sample_posterior_predictive(idata)

I’ve run this time series problem before and have not seen this issue. I’m not sure why t_test would be a problem as an array of floats, since the original data this model was fit on also had an array of floats for t.

Has anyone run into this issue?

My initial thought is there is a bug somewhere along the pipeline that unpacks your dataframe to the pymc model. If it is possible to share your dataset, that might make it easier to locate so I can try your code locally.

Which object is array([1., 1.02083333, 1.04166667, 1.0625, 1.08333333, 1.10416667, 1.125, 1.14583333, 1.16666667, 1.1875, 1.20833333, 1.22916667])? I can’t quite tell from the error message, but it looks like it’s pointing to test_month_idx, which really should be integers.
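The last part of the error message ('Container name "t"') names the data container that rejected the value. As a quick cross-check, a minimal sketch like the one below lists the dtype of every entry in a dict shaped like the one passed to pm.set_data. The helper name report_dtypes and the example values are hypothetical stand-ins, not part of the original code.

```python
import numpy as np


def report_dtypes(new_data):
    """Return {name: dtype string} for each array-like value in the dict."""
    return {name: str(np.asarray(values).dtype) for name, values in new_data.items()}


# Hypothetical stand-in values; in the thread these would be the
# arrays passed to pm.set_data.
example = {
    "month": np.array([0, 1, 2]),
    "t": np.array([1.0, 1.02083333, 1.04166667]),
}

dtypes = report_dtypes(example)
for name, dtype in dtypes.items():
    print(f"{name}: {dtype}")
```

Any entry that reports a float dtype while the corresponding pm.Data container was created from integers is a candidate for the TypeError above.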

Thank you.

That array is t_test. test_time_idx are integers. If it helps, my model is:

with pm.Model(coords=coords) as constant_model:    
    #Data that does not change
    cat_to_bl_map = pm.Data('cat_to_bl_map', cat_to_bl_idx, mutable=False)
    subcat_to_cat_map = pm.Data('subcat_to_cat_map', subcat_to_cat_idx, mutable=False)
    ic_to_subcat_map = pm.Data('ic_to_subcat_map', ic_to_subcat_idx, mutable=False)
    ic_to_item_map = pm.Data('ic_to_item_map', ic_to_item_idx, mutable = False)
    
    #Data that does change
    pm_loc_idx = pm.Data('loc_idx', location_idx, mutable = True)
    pm_item_idx = pm.Data('item_idx', item_idx, mutable=True)
    pm_time_idx = pm.Data('time_idx', time_idx, mutable=True)
    observed_eaches = pm.Data('observed_eaches', df1.residual, mutable=True)
    t_ = pm.Data('t', time_idx, mutable = True)
    promo_ = pm.Data('promotion', promo_idx, mutable = True)
    cann_ = pm.Data('cannibalization', cann_idx, mutable = True)
    dc_discount_ = pm.Data('dc_discount', dc_idx, mutable = True)
    free_fin_ = pm.Data('free_fin', free_fin_idx, mutable = True)
    pvbv_ = pm.Data('pvbv', promo_pvbv_idx, mutable = True)
    giftset_ = pm.Data('giftset', giftset_idx, mutable = True)
    month_ = pm.Data('month', month_idx, mutable = True)

    #Random Variables
    mu_intercept = pm.Normal('mu_intercept', mu = 0, sigma = .5)
    bl_intercept = utility_functions.make_next_level_hierarchy_variable(name='bl_intercept', mu=mu_intercept, alpha=2, beta=1, dims=['business_line'])
    cat_intercept = utility_functions.make_next_level_hierarchy_variable(name='cat_intercept', mu=bl_intercept[cat_to_bl_map], alpha=2, beta=1,  dims=['category'])
    subcat_intercept = utility_functions.make_next_level_hierarchy_variable(name='subcat_intercept', mu=cat_intercept[subcat_to_cat_map],  alpha=2, beta=1, dims=['subcategory'])
    ic_intercept = utility_functions.make_next_level_hierarchy_variable(name='ic_intercept', mu=subcat_intercept[ic_to_subcat_map],  alpha=2, beta=1, dims=['ic'])
    item_intercept = utility_functions.make_next_level_hierarchy_variable(name='item_intercept', mu=ic_intercept[ic_to_item_map], alpha=2, beta=1,  dims=['item'])

    loc_intercept = pm.Normal('loc_intercept', mu = 0, sigma = .5, dims = ['location'])
    loc_bl = utility_functions.make_next_level_hierarchy_variable(name='loc_bl', mu=loc_intercept, alpha=2, beta=1, dims=['business_line', 'location'])
    loc_cat = utility_functions.make_next_level_hierarchy_variable(name='loc_cat', mu=loc_bl[cat_to_bl_map], alpha=2, beta=1, dims=['category', 'location'])
    loc_subcat = utility_functions.make_next_level_hierarchy_variable(name='loc_subcat', mu=loc_cat[subcat_to_cat_map], alpha=2, beta=1, dims=['subcategory', 'location'])
    loc_ic = utility_functions.make_next_level_hierarchy_variable(name='loc_ic', mu=loc_subcat[ic_to_subcat_map], alpha=2, beta=1, dims=['ic', 'location'])
    loc_item = utility_functions.make_next_level_hierarchy_variable(name='loc_item', mu=loc_ic[ic_to_item_map], alpha=2, beta=1, dims=['item', 'location'])

    promo_intercept = pm.Normal('promo_intercept', mu =0, sigma = .5)
    bl_promo = utility_functions.make_next_level_hierarchy_variable(name='bl_promo', mu=promo_intercept, alpha=2, beta=1, dims=['business_line'])
    cat_promo = utility_functions.make_next_level_hierarchy_variable(name='cat_promo', mu=bl_promo[cat_to_bl_map], alpha=2, beta=1,  dims=['category'])
    subcat_promo = utility_functions.make_next_level_hierarchy_variable(name='subcat_promo', mu=cat_promo[subcat_to_cat_map],  alpha=2, beta=1, dims=['subcategory'])
    ic_promo = utility_functions.make_next_level_hierarchy_variable(name='ic_promo', mu=subcat_promo[ic_to_subcat_map],  alpha=2, beta=1, dims=['ic'])
    item_promo = utility_functions.make_next_level_hierarchy_variable(name='item_promo', mu=ic_promo[ic_to_item_map], alpha=2, beta=1,  dims=['item'])
    
    mu_cann = pm.Normal('mu_cann', mu = 0, sigma = .5)
    bl_cann = utility_functions.make_next_level_hierarchy_variable(name='bl_cann', mu=mu_cann, alpha=2, beta=1, dims=['business_line'])
    cat_cann = utility_functions.make_next_level_hierarchy_variable(name='cat_cann', mu=bl_cann[cat_to_bl_map], alpha=2, beta=1,  dims=['category'])
    subcat_cann = utility_functions.make_next_level_hierarchy_variable(name='subcat_cann', mu=cat_cann[subcat_to_cat_map],  alpha=2, beta=1, dims=['subcategory'])
    ic_cann = utility_functions.make_next_level_hierarchy_variable(name='ic_cann', mu=subcat_cann[ic_to_subcat_map],  alpha=2, beta=1, dims=['ic'])
    item_cann = utility_functions.make_next_level_hierarchy_variable(name='item_cann', mu=ic_cann[ic_to_item_map], alpha=2, beta=1,  dims=['item'])
    
    mu_dc_discount = pm.Normal('mu_dc_discount', mu = 0, sigma = .5)
    bl_dc_discount = utility_functions.make_next_level_hierarchy_variable(name='bl_dc_discount', mu=mu_dc_discount, alpha=2, beta=1, dims=['business_line'])
    cat_dc_discount = utility_functions.make_next_level_hierarchy_variable(name='cat_dc_discount', mu=bl_dc_discount[cat_to_bl_map], alpha=2, beta=1,  dims=['category'])
    subcat_dc_discount = utility_functions.make_next_level_hierarchy_variable(name='subcat_dc_discount', mu=cat_dc_discount[subcat_to_cat_map],  alpha=2, beta=1, dims=['subcategory'])
    ic_dc_discount = utility_functions.make_next_level_hierarchy_variable(name='ic_dc_discount', mu=subcat_dc_discount[ic_to_subcat_map],  alpha=2, beta=1, dims=['ic'])
    item_dc_discount = utility_functions.make_next_level_hierarchy_variable(name='item_dc_discount', mu=ic_dc_discount[ic_to_item_map], alpha=2, beta=1,  dims=['item'])
    
    mu_free_fin = pm.Normal('mu_free_fin', mu = 0, sigma = .5)
    bl_free_fin = utility_functions.make_next_level_hierarchy_variable(name='bl_free_fin', mu=mu_free_fin, alpha=2, beta=1, dims=['business_line'])
    cat_free_fin = utility_functions.make_next_level_hierarchy_variable(name='cat_free_fin', mu=bl_free_fin[cat_to_bl_map], alpha=2, beta=1,  dims=['category'])
    subcat_free_fin = utility_functions.make_next_level_hierarchy_variable(name='subcat_free_fin', mu=cat_free_fin[subcat_to_cat_map],  alpha=2, beta=1, dims=['subcategory'])
    ic_free_fin = utility_functions.make_next_level_hierarchy_variable(name='ic_free_fin', mu=subcat_free_fin[ic_to_subcat_map],  alpha=2, beta=1, dims=['ic'])
    item_free_fin = utility_functions.make_next_level_hierarchy_variable(name='item_free_fin', mu=ic_free_fin[ic_to_item_map], alpha=2, beta=1,  dims=['item'])
    
    mu_pvbv = pm.Normal('mu_pvbv', mu = 0, sigma = .5)
    bl_pvbv = utility_functions.make_next_level_hierarchy_variable(name='bl_pvbv', mu=mu_pvbv, alpha=2, beta=1, dims=['business_line'])
    cat_pvbv = utility_functions.make_next_level_hierarchy_variable(name='cat_pvbv', mu=bl_pvbv[cat_to_bl_map], alpha=2, beta=1,  dims=['category'])
    subcat_pvbv = utility_functions.make_next_level_hierarchy_variable(name='subcat_pvbv', mu=cat_pvbv[subcat_to_cat_map],  alpha=2, beta=1, dims=['subcategory'])
    ic_pvbv = utility_functions.make_next_level_hierarchy_variable(name='ic_pvbv', mu=subcat_pvbv[ic_to_subcat_map],  alpha=2, beta=1, dims=['ic'])
    item_pvbv = utility_functions.make_next_level_hierarchy_variable(name='item_pvbv', mu=ic_pvbv[ic_to_item_map], alpha=2, beta=1,  dims=['item'])
    
    mu_giftset = pm.Normal('mu_giftset', mu = 0, sigma = .5)
    bl_giftset = utility_functions.make_next_level_hierarchy_variable(name='bl_giftset', mu=mu_giftset, alpha=2, beta=1, dims=['business_line'])
    cat_giftset = utility_functions.make_next_level_hierarchy_variable(name='cat_giftset', mu=bl_giftset[cat_to_bl_map], alpha=2, beta=1,  dims=['category'])
    subcat_giftset = utility_functions.make_next_level_hierarchy_variable(name='subcat_giftset', mu=cat_giftset[subcat_to_cat_map],  alpha=2, beta=1, dims=['subcategory'])
    ic_giftset = utility_functions.make_next_level_hierarchy_variable(name='ic_giftset', mu=subcat_giftset[ic_to_subcat_map],  alpha=2, beta=1, dims=['ic'])
    item_giftset = utility_functions.make_next_level_hierarchy_variable(name='item_giftset', mu=ic_giftset[ic_to_item_map], alpha=2, beta=1,  dims=['item'])
        
    month_coeff = pm.Normal('month_coeff', mu = 0, sigma = .5)
    
    mu = (item_intercept[pm_item_idx]  * t_[pm_time_idx] + loc_item[pm_item_idx, pm_loc_idx] + item_promo[pm_item_idx]*promo_ +
         item_cann[pm_item_idx]*cann_  + item_dc_discount[pm_item_idx]*dc_discount_ + item_free_fin[pm_item_idx]*free_fin_ + item_pvbv[pm_item_idx]*pvbv_ + 
         item_giftset[pm_item_idx]*giftset_ + month_coeff*month_)

    sigma = pm.HalfNormal('sigma', sigma=10)

    eaches = pm.Normal('predicted_eaches',
                                mu=mu,
                                sigma=sigma,
                                # lower = 0,
                                observed=observed_eaches)

I’m not sure why it would work when I fit the model but fail when I swap in the OOS data. The OOS set is just a slice lopped off the original data set used for fitting.

Yeah, it looks like the definition of t (t_ = pm.Data('t', time_idx, mutable = True)) is where the problem arises. I assume time_idx are integers? If so, t now expects future data to also be integers. But when you pass in t_test = np.arange(1, 1+12/T, 1/T), those are all floats.
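To make the mismatch concrete, here is a small NumPy-only sketch. It assumes T = 48 (a step of 1/48 matches the spacing of the values in the error message) and uses a short hypothetical time_idx with dtype int32 to mirror the container type reported in the traceback; neither value comes from the original post.

```python
import numpy as np

T = 48  # assumption: chosen so that 1/T matches the spacing in the error message

# Hypothetical integer codes standing in for the factorized training index.
time_idx = np.array([0, 1, 2, 3], dtype="int32")

# A float step makes np.arange return float64 values.
t_test = np.arange(1, 1 + 12 / T, 1 / T)

print(time_idx.dtype, t_test.dtype)

# float64 -> int32 is not a "safe" cast, which is what the TypeError complains about.
is_safe = np.can_cast(t_test.dtype, time_idx.dtype, casting="safe")
print(is_safe)
```

The container's dtype is fixed by the data it was created with, so the check has to be made against the training-time array rather than the new one.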

The original error says you are trying to set data that is float64 when the original mutable data was int32. Make sure the original data is float64, or cast the new data to int32.

If the original data was supposed to be integer, make sure the new data is integer as well. You can always cast explicitly with numpy.asarray(x).astype(...).
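A rough sketch of the two casting options, using NumPy only. T = 48 and the short time_idx array are assumed stand-ins, not values from the original post.

```python
import numpy as np

T = 48  # assumption, not from the original post
time_idx = np.array([0, 1, 2, 3])           # hypothetical integer training index
t_test = np.arange(1, 1 + 12 / T, 1 / T)    # float64 out-of-sample values

# Option 1: make the training-time data float64, so a container built
# from it can later accept float64 values.
t_train = np.asarray(time_idx).astype("float64")

# Option 2: cast the new data to int32. This truncates toward zero,
# so every value in [1, 2) collapses to 1.
t_test_int = np.asarray(t_test).astype("int32")

print(t_train.dtype, t_test_int.dtype)
print(np.unique(t_test_int))
```

Option 2 silences the error but discards the fractional part of t_test, so option 1 is probably the better fit when t is meant to be a continuous time variable.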
