I have a small, relatively simple dataset, but fitting a partial-pooling model produces a lot of divergences, even with a non-centered parameterization (PyMC3 version 3.8).
I really want to use partial pooling because I have more features to include, but I started with just 3 features since I was already getting divergences.
import pymc3 as pm

def hierarchical_normal(name, shape, μ=0., sig=1., centered=False):
    if not centered:
        # Non-centered: scale standard-normal offsets by a shared σ
        Δ = pm.Normal('Δ_{}'.format(name), 0., 1., shape=shape)
        σ = pm.HalfCauchy('σ_{}'.format(name), sig)
        return pm.Deterministic(name, μ + Δ * σ)
    else:
        # Centered: draw the group effects directly from Normal(mu, σ)
        mu = pm.Normal('μ_{}'.format(name), μ, 1)
        σ = pm.HalfCauchy('σ_{}'.format(name), sig)
        return pm.Normal('α_{}'.format(name), mu, σ, shape=shape)
with pm.Model() as model1:
    β0 = pm.Normal('β0', 0., 1., testval=df.yes.sum()/df['N'].sum())
    α_age = hierarchical_normal('age', n_age, sig=3, centered=False)
    α_income = hierarchical_normal('income', n_income, sig=3, centered=False)
    α_gender = hierarchical_normal('gender', n_gender, sig=3, centered=False)
    η = β0 \
        + α_age[age_] \
        + α_income[income_] \
        + α_gender[gender_]
    p = pm.Deterministic('p', pm.math.invlogit(η))
    obs = pm.Binomial('obs', df.N.values, p, observed=df.yes.values)
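To make the indexing in the linear predictor concrete, here is a minimal plain-Python sketch of what `η` and `p` compute for each row. All effect values and group codes below are made up for illustration (they are not estimates or rows from my data), and `invlogit` is a hand-written stand-in for `pm.math.invlogit`.

```python
import math

def invlogit(x):
    """Inverse logit (logistic) function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical values, for illustration only (not estimates from the real data).
beta0 = -0.5
alpha_age = [0.2, -0.1, 0.4]   # one effect per age group
alpha_income = [0.0, 0.3]      # one effect per income group
alpha_gender = [-0.2, 0.2]     # one effect per gender group

# Hypothetical integer group codes, one entry per data row.
age_idx = [0, 2, 1]
income_idx = [1, 0, 1]
gender_idx = [0, 1, 1]

# Each row picks out one effect per feature and adds them to the intercept.
eta = [
    beta0 + alpha_age[a] + alpha_income[i] + alpha_gender[g]
    for a, i, g in zip(age_idx, income_idx, gender_idx)
]

# The inverse logit turns the linear predictor into a success probability.
p = [invlogit(e) for e in eta]
```

In the model, `age_`, `income_`, and `gender_` play the role of the index lists here, and `p` feeds the Binomial likelihood together with the per-row trial counts.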
with model1:
    trace = pm.sample(
        init='advi',
        draws=1000,
        tune=5000,
        target_accept=0.95,
        random_seed=99
    )
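For reference, the non-centered branch of `hierarchical_normal` relies on the fact that `μ + Δ * σ` with `Δ ~ Normal(0, 1)` has the same distribution as a direct draw from `Normal(μ, σ)`. Below is a minimal sketch that checks this numerically using only the standard library; the values of `mu` and `sigma` and the sample size are arbitrary assumptions, not anything from my model.

```python
import random
import statistics

random.seed(0)

# Hypothetical group mean and scale, chosen only for this illustration.
mu, sigma = 1.5, 2.0
n = 50_000

# Centered: draw the effect directly from Normal(mu, sigma).
centered = [random.gauss(mu, sigma) for _ in range(n)]

# Non-centered: draw a standard-normal offset, then shift and scale it.
noncentered = [mu + random.gauss(0.0, 1.0) * sigma for _ in range(n)]

centered_mean = statistics.mean(centered)
centered_sd = statistics.stdev(centered)
noncentered_mean = statistics.mean(noncentered)
noncentered_sd = statistics.stdev(noncentered)

print("centered:     mean=%.3f sd=%.3f" % (centered_mean, centered_sd))
print("non-centered: mean=%.3f sd=%.3f" % (noncentered_mean, noncentered_sd))
```

Both versions describe the same prior on the group effects; the only difference is which quantities the sampler explores directly (the offsets `Δ` versus the effects themselves).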
I’m surprised to see it fit so poorly, since it’s such a simple dataset.
Trace plots are below.
The full code is attached as a .py file:
test.py (5.8 KB)