# Hierarchical Logistic Regression. How to reparameterize?

I have a simple case with 30 groups(NETWORKS) and only 1 RV(EXP_FLAG) where dependent variable binary class (0 or 1). 65100 rows × 3 columns

#split variables and define network idx
x, y = data_tv_s.EXP_FLAG, data_tv_s.CONV_FLAG
network_idx = data_tv_s[‘NETWORK_IDX’].values

Building Individual (unpooled) model:
with pm.Model() as unpooled_model:

``````# priors
alpha = pm.Normal('alpha', mu=0, sd=10, shape =len(data_tv_s.NETWORK_IDX.unique()))
beta = pm.Normal('beta', mu=0, sd=10, shape =len(data_tv_s.NETWORK_IDX.unique()))

m = alpha[network_idx] + pm.math.dot(x, beta[network_idx])

# inverse link function with alias for the Theano function
theta = pm.Deterministic('theta', pm.math.sigmoid(m))

# boundary decision, which is the value used to separate class of the target
bd = pm.Deterministic('bd', -alpha[network_idx]/beta[network_idx])

conv = pm.Bernoulli('conv', p=theta, observed=y)

priors_unpooled = pm.sample_prior_predictive()

# posterior/create the trace
trace_unpooled = pm.sample(chains= 2, target_accept = 0.95)
``````

Takes 10hr to run and results are following:

Then I’m building HIERARCHICAL with 4chains:
with pm.Model() as hr_model:

``````# hyper-priors
alpha_m = pm.Normal('alpha_m', mu=0, sd=10)
alpha_sigma = pm.HalfNormal('alpha_sigma', sd= 10)
beta_m = pm.Normal('beta_m', mu=0, sd=10)
beta_sigma = pm.HalfNormal('beta_sigma', sd=10)

# priors
alpha = pm.Normal('alpha',
mu=alpha_m,
sd=alpha_sigma,
shape =len(data_tv_s.NETWORK_IDX.unique()))
beta = pm.Normal('beta',
mu=beta_m,
sd=beta_sigma,
shape =len(data_tv_s.NETWORK_IDX.unique()))

m = alpha[network_idx] + pm.math.dot(x, beta[network_idx])

# inverse link function with alias for the Theano function
theta = pm.Deterministic('theta', pm.math.sigmoid(m))

# boundary decision, which is the value used to separate class of the target
bd = pm.Deterministic('bd', -alpha[network_idx]/beta[network_idx])

conv = pm.Bernoulli('conv', p=theta, observed=y)

priors_hr = pm.sample_prior_predictive()

# posterior/create the trace
trace_hr = pm.sample(chains= 4, target_accept = 0.95)
``````

Takes for me 20 Hours to run with results:

r_hat is so high and each chain reached the maximum tree depth. What do I do wrong in here ?
My forest for betas in both unpooled and Hierarchical models for each group has wide HDI interval. Following images are from Hierarchical models:

Intercept and Variance in between groups:

You could try to fit this model using Bambi.

The non-hierarchical version

``````model = bmb.Model(
"CONV_FLAG ~ EXP_FLAG + NETWORK_IDX",
data_tv_s,
family="bernoulli"
)
idata = model.fit()
``````

The hierarchical version (i.e. varying intercept and varying slopes)

``````model = bmb.Model(
"CONV_FLAG ~ EXP_FLAG + NETWORK_IDX + (EXP_FLAG|NETWORK_IDX)",
data_tv_s,
family="bernoulli"
)
idata = model.fit()
``````

It can also be good to select a sub-sample of the data in the first attempts to fit the model to detect possible improvements faster.

1 Like