I just built a hierarchical binomial regression with nesting and was hoping I could get some feedback on it. I thought I had specified this model to have partial pooling and shrinkage, but when I run simulations it doesn’t look like there’s shrinkage towards the mean. This is the first model I’ve built on my own, so it’d be nice to have a sanity check. The model is below:
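$$\text{conversions} \sim \text{Binomial}(N, p)$$
$$\text{logit}(p) = \alpha_{\text{source}[i]}$$
$$\alpha_{\text{source}} \sim \text{Normal}(\alpha_{\text{channel}[i]}, \sigma_s)$$
$$\alpha_{\text{channel}} \sim \text{Normal}(\alpha, \sigma_{ch})$$
$$\alpha \sim \text{Normal}(0, 1.5)$$
$$\sigma_s \sim \text{Exponential}(1)$$
$$\sigma_{ch} \sim \text{Exponential}(1)$$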
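For reference, here is a minimal sketch of the generative process that the model above describes, written with only the Python standard library. It is not the code from the attachment. The function names (`inv_logit`, `simulate`), the number of channels and sources, and the fixed traffic per source are all assumptions made for illustration.

```python
import math
import random


def inv_logit(x):
    """Map a log-odds value to a probability, in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate(n_channels=3, sources_per_channel=4, traffic=100, seed=0):
    """Draw one dataset from the priors and the binomial likelihood.

    The sizes and the constant traffic are hypothetical defaults.
    """
    rng = random.Random(seed)
    alpha = rng.gauss(0.0, 1.5)      # alpha ~ Normal(0, 1.5)
    sigma_ch = rng.expovariate(1.0)  # sigma_ch ~ Exponential(1)
    sigma_s = rng.expovariate(1.0)   # sigma_s ~ Exponential(1)

    rows = []
    source_id = 0
    for c in range(n_channels):
        # Channel intercepts are drawn around the global mean.
        alpha_channel = rng.gauss(alpha, sigma_ch)
        for _ in range(sources_per_channel):
            source_id += 1
            # Source intercepts are drawn around their own channel's intercept.
            alpha_source = rng.gauss(alpha_channel, sigma_s)
            p = inv_logit(alpha_source)
            # Binomial(traffic, p) as a sum of Bernoulli draws.
            conversions = sum(rng.random() < p for _ in range(traffic))
            rows.append((f"channel_{c + 1}", f"source_{source_id}", conversions, traffic))
    return rows


for row in simulate():
    print(row)
```

Each printed tuple has the same layout as a row of the dataframe: channel, source, conversions, traffic.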
The data is set up as follows:
- There are advertising channels
- Nested within advertising channels are advertising sources
- Each row of the dataframe is a unique source. On that row, there is a number of conversions and a number of trials, N, for that source.
For example:
[['channel', 'source', 'conversions', 'traffic'], [channel_1, source_1, 10, 100], [channel_1, source_2, 8, 90]]
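With rows laid out like this, the nesting is usually passed to the model as an integer channel index per source. The sketch below shows one way to build it in plain Python. The helper name `build_indices` and the third row are hypothetical and only there to make the example show more than one channel.

```python
rows = [
    ("channel_1", "source_1", 10, 100),
    ("channel_1", "source_2", 8, 90),
    ("channel_2", "source_3", 30, 400),  # hypothetical extra row, not from the post
]


def build_indices(rows):
    """Return channel names, a per-source channel index, conversions, and traffic."""
    channels = sorted({r[0] for r in rows})
    channel_id = {name: k for k, name in enumerate(channels)}
    # One entry per row (i.e. per source), pointing at that source's channel.
    channel_idx = [channel_id[r[0]] for r in rows]
    conversions = [r[2] for r in rows]
    traffic = [r[3] for r in rows]
    return channels, channel_idx, conversions, traffic


channels, channel_idx, conversions, traffic = build_indices(rows)
print(channels)
print(channel_idx)
```

Because each row is a unique source, the row position can serve as the source index, and `channel_idx[i]` picks out the channel intercept that source `i` is drawn around.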
Am I specifying something wrong that leads to a lack of shrinkage? Or maybe I just have so many observations that there’s less shrinkage? Or maybe shrinkage is a more subtle feature than I assumed?
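One rough way to gauge the "so many observations" possibility is a normal approximation on the logit scale, sketched below. This is an approximation, not the exact posterior of the model: it treats the data for one source as carrying precision of about N·p·(1−p) on the logit scale and the channel-level prior as carrying precision 1/σ_s², then weights the two. The function names and the example values (p = 0.1, σ_s = 1) are assumptions chosen for illustration.

```python
import math


def pooling_weight(n_trials, p, sigma_s):
    """Approximate weight placed on the channel mean (0 = no shrinkage, 1 = full)."""
    data_precision = n_trials * p * (1.0 - p)  # approximate binomial information on the logit scale
    prior_precision = 1.0 / sigma_s ** 2       # precision of Normal(alpha_channel, sigma_s)
    return prior_precision / (prior_precision + data_precision)


def approx_shrunk_logit(conversions, n_trials, channel_logit, sigma_s):
    """Precision-weighted blend of the empirical logit and the channel logit.

    Requires 0 < conversions < n_trials so the empirical logit is finite.
    """
    p_hat = conversions / n_trials
    w = pooling_weight(n_trials, p_hat, sigma_s)
    empirical_logit = math.log(p_hat / (1.0 - p_hat))
    return w * channel_logit + (1.0 - w) * empirical_logit


for n in (10, 100, 10_000):
    print(n, round(pooling_weight(n, p=0.1, sigma_s=1.0), 4))
```

Under these assumptions the weight on the channel mean is already only about 0.1 at N = 100 and falls further as N grows, so visibly weak shrinkage would not by itself indicate a misspecified model.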
I’ve also attached an image that is admittedly a little sloppy (the code is too). Each channel sits between a pair of vertical dashed lines. The solid blue lines are the predicted channel means, the solid red lines are the true (not empirical) channel conversion rates, and the horizontal blue dashed line is the true (not empirical) mean conversion rate across channels.
example.py (2.8 KB)