I’m trying to perform inference on a hierarchical logistic regression model with three nominal predictors. (I’m following along with the Chapter 21 notebook from Kruschke’s Doing Bayesian Data Analysis: https://github.com/JWarmenhoven/DBDA-python/blob/master/Notebooks/Chapter%2021.ipynb)
I’m able to sample using NUTS and the sampling seems reasonable, but it takes roughly 5 minutes on 84 data points with 18 total predictors. Is this a typical sampling speed for a model of this size?
I don’t know whether the shape of the posterior is what’s causing the slowness, since the scatter plots I produced don’t show correlation between most pairs of variables.
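For what it’s worth, the eyeball check of the scatter plots can also be done numerically. This is just a sketch with made-up draws (the array shape and the 0.9 threshold are arbitrary, not from my actual trace): stack the flattened posterior samples row-per-variable and look at the pairwise correlations.

```python
import numpy as np

# Dummy stand-in for flattened posterior draws:
# rows = variables, columns = samples (in practice, stack trace['a_1'], etc.)
rng = np.random.default_rng(0)
draws = rng.normal(size=(3, 1000))  # 3 dummy variables, 1000 draws each

# Pairwise Pearson correlations between the variables
corr = np.corrcoef(draws)

# Flag variable pairs with strong posterior correlation (arbitrary cutoff)
i, j = np.triu_indices_from(corr, k=1)
strong = [(a, b) for a, b, c in zip(i, j, corr[i, j]) if abs(c) > 0.9]
print(strong)
```

Independent dummy draws like these should produce no strongly correlated pairs; on a real trace, any pairs that do show up would point at the funnel-like geometry that slows NUTS down.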
Here is a scatter plot of one of the omega variables against the kappa variable (the individual-level concentration),
and a more typical plot of a coefficient against its scaling parameter.
Attached is the Python code I’m using to build the model; the model itself is reproduced below for quick reference.
Any help is greatly appreciated.
```python
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:
    # Intercept on the logit scale
    a_0 = pm.Normal('a_0', mu=0.0, tau=1 / (2 ** 2), shape=1)

    # Non-centered deflections for each nominal predictor
    a_1_sigma = pm.HalfNormal('a_1_sigma', sd=10)
    a_1_offset = pm.Normal('a_1_offset', mu=0, sd=1,
                           shape=x_1.eval().dtype.categories.size)
    a_1 = pm.Deterministic('a_1', 0.0 + a_1_offset * a_1_sigma)

    a_3_sigma = pm.HalfNormal('a_3_sigma', sd=10)
    a_3_offset = pm.Normal('a_3_offset', mu=0, sd=1,
                           shape=x_3.eval().dtype.categories.size)
    a_3 = pm.Deterministic('a_3', 0.0 + a_3_offset * a_3_sigma)

    a_4_sigma = pm.HalfNormal('a_4_sigma', sd=10)
    a_4_offset = pm.Normal('a_4_offset', mu=0, sd=1,
                           shape=x_4.eval().dtype.categories.size)
    a_4 = pm.Deterministic('a_4', 0.0 + a_4_offset * a_4_sigma)

    # Linear predictor on the logit scale, one entry per observation
    link_argument = (
        a_0 +
        a_1[x_1.eval().codes] +
        a_3[x_3.eval().codes] +
        a_4[x_4.eval().codes]
    )
    omega = pm.Deterministic('omega', pm.invlogit(link_argument))
    kappa = pm.Gamma('kappa', 0.1, 0.1)  # concentration, shared across cells

    # Mean parameter for individual data (shape=84 for train data)
    mu = pm.Beta('mu',
                 alpha=omega * kappa + 1,
                 beta=(1 - omega) * kappa + 1,
                 shape=x_1.eval().codes.size)

    likelihood = pm.Binomial('likelihood', p=mu, n=n, observed=y_obs)

    # Rescale coefficients to be deflections from baseline
    b_0 = pm.Deterministic('b_0', tt.mean(link_argument))
    b_1 = pm.Deterministic('b_1', link_argument[x_1.eval().codes] - b_0)
    b_3 = pm.Deterministic('b_3', link_argument[x_3.eval().codes] - b_0)
    b_4 = pm.Deterministic('b_4', link_argument[x_4.eval().codes] - b_0)
```
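To make the rescaling at the end of the model concrete, here is a plain NumPy sketch of the baseline/deflection decomposition the `b_*` Deterministics compute (the numbers are made up, not posterior draws): the baseline is the grand mean of the linear predictor, and the deflections are the values minus that baseline, so they average to zero by construction.

```python
import numpy as np

# Dummy stand-in for the per-observation linear predictor (logit scale)
link_argument = np.array([0.4, -1.2, 0.9, 0.3, -0.7])

# Baseline = grand mean; deflections = values minus the baseline
b_0 = link_argument.mean()
deflections = link_argument - b_0

print(deflections.mean())  # ~0, up to floating-point error
```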
hierarchical_logistic_regression.py (4.0 KB)
anonymized_data.csv (3.8 KB)