Hierarchical Logistic Regression. How to reparameterize?

vanko · October 25, 2021, 3:39pm

I have a simple case with 30 groups(NETWORKS) and only 1 RV(EXP_FLAG) where dependent variable binary class (0 or 1). 65100 rows × 3 columns

#split variables and define network idx
x, y = data_tv_s.EXP_FLAG, data_tv_s.CONV_FLAG
network_idx = data_tv_s[‘NETWORK_IDX’].values

Building Individual (unpooled) model:
with pm.Model() as unpooled_model:

# priors
alpha = pm.Normal('alpha', mu=0, sd=10, shape =len(data_tv_s.NETWORK_IDX.unique()))
beta = pm.Normal('beta', mu=0, sd=10, shape =len(data_tv_s.NETWORK_IDX.unique()))

m = alpha[network_idx] + pm.math.dot(x, beta[network_idx])

# inverse link function with alias for the Theano function
theta = pm.Deterministic('theta', pm.math.sigmoid(m)) 

# boundary decision, which is the value used to separate class of the target
bd = pm.Deterministic('bd', -alpha[network_idx]/beta[network_idx])

conv = pm.Bernoulli('conv', p=theta, observed=y)

priors_unpooled = pm.sample_prior_predictive()

# posterior/create the trace
trace_unpooled = pm.sample(chains= 2, target_accept = 0.95)

Takes 10hr to run and results are following:

Then I’m building HIERARCHICAL with 4chains:
with pm.Model() as hr_model:

# hyper-priors
alpha_m = pm.Normal('alpha_m', mu=0, sd=10)
alpha_sigma = pm.HalfNormal('alpha_sigma', sd= 10)
beta_m = pm.Normal('beta_m', mu=0, sd=10)
beta_sigma = pm.HalfNormal('beta_sigma', sd=10)


# priors
alpha = pm.Normal('alpha', 
                  mu=alpha_m, 
                  sd=alpha_sigma, 
                  shape =len(data_tv_s.NETWORK_IDX.unique()))
beta = pm.Normal('beta', 
                 mu=beta_m, 
                 sd=beta_sigma, 
                 shape =len(data_tv_s.NETWORK_IDX.unique()))

m = alpha[network_idx] + pm.math.dot(x, beta[network_idx])

# inverse link function with alias for the Theano function
theta = pm.Deterministic('theta', pm.math.sigmoid(m)) 

# boundary decision, which is the value used to separate class of the target
bd = pm.Deterministic('bd', -alpha[network_idx]/beta[network_idx])

conv = pm.Bernoulli('conv', p=theta, observed=y)

priors_hr = pm.sample_prior_predictive()

# posterior/create the trace
trace_hr = pm.sample(chains= 4, target_accept = 0.95)

Takes for me 20 Hours to run with results:

r_hat is so high and each chain reached the maximum tree depth. What do I do wrong in here ?
My forest for betas in both unpooled and Hierarchical models for each group has wide HDI interval. Following images are from Hierarchical models:

Intercept and Variance in between groups:

tcapretto · October 26, 2021, 9:10pm

You could try to fit this model using Bambi.

The non-hierarchical version

model = bmb.Model(
  "CONV_FLAG ~ EXP_FLAG + NETWORK_IDX", 
  data_tv_s,
  family="bernoulli"
)
idata = model.fit()

The hierarchical version (i.e. varying intercept and varying slopes)

model = bmb.Model(
  "CONV_FLAG ~ EXP_FLAG + NETWORK_IDX + (EXP_FLAG|NETWORK_IDX)", 
  data_tv_s,
  family="bernoulli"
)
idata = model.fit()

It can also be good to select a sub-sample of the data in the first attempts to fit the model to detect possible improvements faster.

Topic		Replies	Views
Hierarchical Logistic Regression with Binomial Likelihood - The chain reached the maximum tree depth, The rhat statistic is larger than 1.05 for some parameters, The estimated number of effective samples is smaller than 200 for some parameters Questions difficult_inference	2	741	December 16, 2021
Hierarchical logistic regression Sharing	17	4424	October 28, 2019
Reparameterization of 3-level hierarchical bayesian model Questions	9	2284	July 12, 2022
Feasibility of large hierarchical multiple regression v5 modeling	4	284	October 24, 2023
Hierarchical Logistic Regression sampling problems Questions	4	1803	January 18, 2018

Hierarchical Logistic Regression. How to reparameterize?

Related topics