Hi, I’ve been working on a negative binomial regression in PyMC. Attached is a simplified version of my model (in reality I have more parameters in the linear equation, and a multilevel component).
with pm.Model() as model:
x_1 = pm.MutableData(
"x_1", X_train['x_1']
)
exposure = pm.MutableData("exposure", X_train["exposure"])
y_obs = pm.MutableData("y_obs", y_train["y"])
intercept_mu = pm.Normal(
"intercept_mu", 0, 10
)
beta_mu_x1 = pm.Normal("beta_mu_x1", 0, 5)
mu = pm.Deterministic(
"mu",
pm.math.exp(
intercept_mu
+ x_1 * beta_mu_x1
)
* exposure,
)
intercept_alpha = pm.Normal(
"intercept_alpha", 0, 10
)
beta_alpha_x1 = pm.Normal("beta_alpha_x1", 0, 5)
alpha = pm.Deterministic(
"alpha",
pm.math.exp(
intercept_alpha
+ x_1 * beta_alpha_x1
)
)
y = pm.NegativeBinomial("y", mu=mu, alpha=alpha, observed=y_obs)
As you can see, I have quite a lot of data and sampling is quite slow. With the full parameterisation it takes c. 1h30 to fit, the results look reasonable though.
I would appreciate if someone could just check that the model is sound for my peace of mind!