Feedback on my first hierarchical Bayesian model

I just built a hierarchical binomial regression with nesting and was hoping to get some feedback on it. I thought I had specified this model to have partial pooling and shrinkage, but when I run simulations it doesn’t look like there’s shrinkage towards the mean. This is the first model I’ve built on my own, so it’d be nice to have a sanity check. The model is below:

𝑐𝑜𝑛𝑣𝑒𝑟𝑠𝑖𝑜𝑛𝑠∼𝐵𝑖𝑛𝑜𝑚𝑖𝑎𝑙(𝑁,𝑝)
𝑙𝑜𝑔𝑖𝑡(𝑝) = 𝛼_𝑠𝑜𝑢𝑟𝑐𝑒[𝑖]
𝛼_𝑠𝑜𝑢𝑟𝑐𝑒∼𝑁𝑜𝑟𝑚𝑎𝑙(𝛼_𝑐ℎ𝑎𝑛𝑛𝑒𝑙[𝑖], 𝜎_𝑠)
𝛼_𝑐ℎ𝑎𝑛𝑛𝑒𝑙∼𝑁𝑜𝑟𝑚𝑎𝑙(𝛼, 𝜎_𝑐ℎ)
𝛼∼𝑁𝑜𝑟𝑚𝑎𝑙(0,1.5)
𝜎_𝑠∼𝐸𝑥𝑝𝑜𝑛𝑒𝑛𝑡𝑖𝑎𝑙(1)
𝜎_𝑐ℎ∼𝐸𝑥𝑝𝑜𝑛𝑒𝑛𝑡𝑖𝑎𝑙(1)

The data is set up as follows:

  • There are advertising channels
  • Nested within advertising channels are advertising sources
  • Each row of the dataframe is a unique source. On that row, there is a number of conversions and a number of trials, N, for that source.

I.e.:
[[‘channel’, ‘source’, ‘conversions’, ‘traffic’], [channel_1, source_1, 10, 100], [channel_1, source_2, 8, 90]]
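
For anyone wanting to reproduce the setup, here is a minimal sketch of how rows in this layout could be simulated from the model above. The traffic range, the number of channels and sources, and the function names are assumptions for illustration; none of them come from the attached example.py.

```python
import math
import random


def inv_logit(x):
    """Map a log-odds value to a probability (numerically stable form)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate_rows(n_channels=3, sources_per_channel=4, seed=0):
    """Draw one dataset from the generative model, one row per source."""
    rng = random.Random(seed)
    alpha = rng.gauss(0.0, 1.5)       # global mean on the logit scale
    sigma_ch = rng.expovariate(1.0)   # spread of channels around alpha
    sigma_s = rng.expovariate(1.0)    # spread of sources around their channel
    rows = []
    for c in range(n_channels):
        alpha_channel = rng.gauss(alpha, sigma_ch)
        for s in range(sources_per_channel):
            alpha_source = rng.gauss(alpha_channel, sigma_s)
            p = inv_logit(alpha_source)
            traffic = rng.randint(50, 500)  # assumed traffic range
            # Binomial(traffic, p) drawn as a sum of Bernoulli trials
            conversions = sum(rng.random() < p for _ in range(traffic))
            rows.append(
                (f"channel_{c + 1}", f"source_{c + 1}_{s + 1}", conversions, traffic)
            )
    return rows


if __name__ == "__main__":
    for row in simulate_rows():
        print(row)
```

Because the true source-level rates are known inside the simulation, this kind of script is handy for checking a fitted model, as long as the comparison is made against the right quantity.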

Am I specifying something wrong that leads to a lack of shrinkage? Or maybe I just have so many observations that there’s less shrinkage? Or maybe shrinkage is a more subtle feature than I assumed?

I’ve also attached an image that is admittedly a little sloppy (the code is too). In it, each channel is compartmentalized between vertical dashed lines. The solid blue lines are the predicted means of a channel, the solid red lines are the true (not empirical) conversion rate of a channel, and the horizontal blue dashed line is the true (not empirical) mean conversion rate across channels.

example.py (2.8 KB)

Hi John,
And welcome :slight_smile:

At first glance, I don’t see obvious issues in the model you shared. To check for shrinkage though, I think the best approach is to compare the hierarchical estimates with the empirical estimates (or those from a no-pooling model), not with the true rates. And since you’ve got two levels of hierarchy, you should do that for each level separately.
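
A rough way to see this comparison without fitting anything is sketched below. Note that it swaps the logit-normal hierarchy for a simple beta-binomial shortcut with a fixed prior strength (an assumed value of 20 pseudo-trials), and the third source is a made-up low-traffic row. Treat it as an illustration of why high-traffic sources barely shrink, not as the model from the post.

```python
def shrinkage_table(rows, prior_strength=20.0):
    """Compare empirical (no-pooling) rates with partially pooled ones.

    rows: iterable of (source, conversions, traffic).
    prior_strength: assumed number of pseudo-trials behind the pooled rate.
    """
    total_conv = sum(conv for _, conv, _ in rows)
    total_traffic = sum(traffic for _, _, traffic in rows)
    pooled = total_conv / total_traffic  # complete-pooling estimate
    table = []
    for source, conv, traffic in rows:
        empirical = conv / traffic  # no-pooling estimate
        # Posterior mean under a Beta prior centred on the pooled rate
        partial = (conv + pooled * prior_strength) / (traffic + prior_strength)
        table.append((source, empirical, partial, pooled))
    return table


rows = [
    ("source_1", 10, 100),  # from the example rows in the question
    ("source_2", 8, 90),    # from the example rows in the question
    ("source_3", 3, 5),     # hypothetical low-traffic source
]

for source, empirical, partial, pooled in shrinkage_table(rows):
    shift = abs(partial - empirical)
    print(f"{source}: empirical={empirical:.3f} partial={partial:.3f} "
          f"pooled={pooled:.3f} shift={shift:.3f}")
```

In this sketch the weight on the pooled rate is prior_strength / (traffic + prior_strength), so a source with hundreds of trials moves very little while a source with a handful of trials moves a lot. The same qualitative pattern is what to look for when plotting the hierarchical estimates against the empirical ones.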

To illustrate this workflow, you can check out the updated radon example notebook (not yet on the website, but on the master branch).

Hope this helps :vulcan_salute:

Good call, that was a slight hiccup on my part: I was using the true conversion rate instead of the empirical one. It looks much better now!

Thanks for sending that link my way. I spent more time than I’d care to admit figuring out how to access information from objects and change plotting elements with PyMC3/ArviZ. I wish I had seen this example sooner, it looks great!

You’re welcome! And don’t worry, it’s quite normal to have a hard time with all the different dimensions and parameters – this stuff is hard and demands time, practice and perseverance :wink: