All my code is a mess too, so don’t be shy about sharing. It looks like you have a good start!
- For setting the priors in PyMC, rather than passing a scalar value as the mean of an effect, you can pass an array with one entry per level. Let’s say you have 5 cities, the 3rd city is Liverpool, and you want to center the prior for Liverpool at 1.0. Then you can write
`pm.Normal('city_effect', mu=np.array([0, 0, 1, 0, 0]), sigma=sigma, dims='city')` (the variable name `'city_effect'` is just a placeholder). I am not 100% sure of the Bambi syntax for something like this; someone with more experience would need to chime in there.
If your prior is over conditional information, as in your example that oranges in Liverpool sell better, you would tweak the mean of the item-given-city effect.
- If you want to make a prediction for a city that isn’t in your data, and therefore isn’t in your model, you’re asking for too much: the model has no estimated effect for that city. You have two options:
First, you could gather all the smaller cities in your dataset into a single “other cities” label, then use this label to generate predictions for unseen cities. The idea here is that smaller markets are all similar to each other and all different from the population mean.
Second, you could set all the city effects to zero when predicting. Assuming the city effects are centered at zero, this amounts to predicting at the mean of the city effects distribution. You would be saying that “our best guess of the effect of an unknown city is the average city effect”.
Again, I apologize because I don’t know the Bambi-specific syntax here. Looking at the predict method I linked, it seems like it would involve setting `include_group_specific=False`, but I’m not sure whether you could turn off city effects while leaving item effects on, for example.
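Setting aside the Bambi syntax, the second option can be sketched by hand with plain NumPy. This assumes a simple additive model (intercept plus city effect) with zero-centered city effects. The arrays below are random stand-ins for posterior draws, not real results, and `predict_mean` is a hypothetical helper, not part of PyMC or Bambi.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws, n_cities = 1000, 5

# Stand-ins for posterior draws; in practice, pull these from your fitted trace
intercept = rng.normal(2.0, 0.1, size=n_draws)
city_effect = rng.normal(0.0, 0.5, size=(n_draws, n_cities))


def predict_mean(intercept, city_effect, city_idx=None):
    """Linear predictor for each posterior draw.

    city_idx=None means an unseen city: its effect is set to zero,
    i.e. the mean of a zero-centered city effects distribution.
    """
    if city_idx is None:
        return intercept
    return intercept + city_effect[:, city_idx]


known_city = predict_mean(intercept, city_effect, city_idx=2)  # a city in the model
unseen_city = predict_mean(intercept, city_effect)             # a city not in the model
```

Because only the city term is dropped, any other effects you add to the linear predictor (item effects, for example) would stay on, which is the behaviour I wasn’t sure how to get from `include_group_specific` alone.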