Centering data using pm.Deterministic

I'm not sure I fully understand your issue, but I guess you are trying to recenter the data within the model according to the new data? By the way, you can still update the mean in your current approach by using pm.set_data for both the data and the mean value. Also, you don't need to wrap the mean in a pm.Deterministic unless you want to track it.

This works without pm.Deterministic:

y = pm.Data("y", df["y"].to_numpy())
y_c = y - df["y"].mean()

If you want to center based on the mean of the new data then you can just do this:

y = pm.Data("y", df["y"].to_numpy())
y_c = y - y.mean()

Is it important for your case to recenter the data on the new mean? In most cases I think it's best practice to center on your original mean for out-of-sample prediction, since the fitted parameters were estimated relative to that mean. The exception would be if you are specifically trying to do out-of-model prediction and just reuse the parameters.
