Dear Bayesians,
I am trying to understand the effect of a global hyperprior in a hierarchical model that simply estimates the success probability p of a Bernoulli variable.
My data-generating process has 3 groups with different p's. Imagine 3 age groups with different conversion rates.
The groups also differ in size: 2 groups are large (n=100), one is smaller (n=20). I observe the following average conversion rates:
0-49 0.29 # n=100
50-69 0.17 # n=100
70+ 0.05 # n=20
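To make the setup reproducible, here is a minimal sketch that rebuilds a data set matching the table above. It assumes the reported rates are exact, i.e. 29/100, 17/100 and 1/20 conversions; the names `group_sizes`, `observed_rates` and `rows` are only illustrative.

```python
# Assumption: the reported rates are exact, so the counts are rate * n.
group_sizes = {"0-49": 100, "50-69": 100, "70+": 20}
observed_rates = {"0-49": 0.29, "50-69": 0.17, "70+": 0.05}

rows = []
for group, n in group_sizes.items():
    k = round(observed_rates[group] * n)  # number of conversions in this group
    rows += [{"age_group": group, "conversion": 1}] * k
    rows += [{"age_group": group, "conversion": 0}] * (n - k)

# Sanity check: conversions per group
conversions = {
    g: sum(r["conversion"] for r in rows if r["age_group"] == g)
    for g in group_sizes
}
print(conversions)
```

Passing `rows` to `pd.DataFrame(rows)` should give a frame with the `age_group` and `conversion` columns that the models below read from `df_conversions`.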
I want to compare how a hierarchical model deals with these numbers compared to a non-hierarchical model.
Hierarchical model:
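import pandas as pd
import pymc as pm  # assumption: PyMC v4+; use `import pymc3 as pm` on older installs

# Integer index of each observation's age group (sorted, so it lines up with COORDS)
age_group_idx, age_group_unique = pd.factorize(df_conversions["age_group"], sort=True)
COORDS = {"age_group": ["0-49", "50-69", "70+"]}
with pm.Model(coords=COORDS) as hierarchical_conversion_model:
    # Hyperpriors shared by all age groups
    alpha_hyper = pm.Gamma("alpha_hyper", alpha=2, beta=2)
    beta_hyper = pm.Gamma("beta_hyper", alpha=2, beta=2)
    # Group-level conversion rates drawn from the common Beta distribution
    p = pm.Beta("p", alpha=alpha_hyper, beta=beta_hyper, dims="age_group")
    conversion_rate = pm.Bernoulli("conversion_rate", p=p[age_group_idx], observed=df_conversions["conversion"].values)
    hierarchical_idata_conversions = pm.sample(1000, return_inferencedata=True)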
Non-hierarchical model:
with pm.Model(coords=COORDS) as nonhierarchical_conversion_model:
    # Independent, fixed Beta(2, 2) prior for each age group
    p = pm.Beta("p", alpha=2, beta=2, dims="age_group")
    conversion_rate = pm.Bernoulli("conversion_rate", p=p[age_group_idx], observed=df_conversions["conversion"].values)
    nonhierarchical_idata_conversions = pm.sample(1000, return_inferencedata=True)
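As a point of reference, the non-hierarchical model is conjugate (Beta prior, Bernoulli likelihood), so its posterior means can be checked by hand without sampling. The following is a minimal sketch, assuming the observed rates correspond to 29/100, 17/100 and 1/20 conversions; the variable names are illustrative only.

```python
# Fixed prior of the non-hierarchical model: Beta(2, 2), prior mean 0.5
prior_alpha, prior_beta = 2, 2

# Assumed (conversions, group size) per age group, derived from the reported rates
data = {"0-49": (29, 100), "50-69": (17, 100), "70+": (1, 20)}

posterior_means = {}
for group, (k, n) in data.items():
    # Conjugate update: Beta(alpha + k, beta + n - k)
    post_alpha = prior_alpha + k
    post_beta = prior_beta + n - k
    posterior_means[group] = post_alpha / (post_alpha + post_beta)
    print(f"{group}: observed {k / n:.3f}, posterior mean {posterior_means[group]:.3f}")
```

Under these assumptions the 70+ posterior mean is 3/24 = 0.125, well above the observed 0.05, because a fixed Beta(2, 2) prior has mean 0.5 and acts like four extra observations. With only 20 data points, that pull is much stronger than in the two large groups.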
Based on what I have read about shrinkage, I expected the estimated conversion rate of the smallest group to be pulled toward the other groups. However, the effect is the opposite:
Hierarchical model:
Non-hierarchical model:
The hierarchical model clearly tracks the observed conversion rate of the 70+ age group more closely, but given the small amount of data, it is not clear to me whether this is good or bad. I would have expected the hierarchical model to pull the 70+ estimate toward the global conversion rate (shrinkage), but this is clearly not the case. Why is that?
Best regards and a merry Christmas!
Matthias