Hi all,
I consistently run into a problem when introducing random effects into models such as binomial regression. My model samples fine without the random effect, but when I add a zero-sum normal random effect for an indicator variable, as follows:
import pandas as pd
import pymc as pm

# Integer codes and unique levels for the preseason indicator
preseason_idx, preseason_indicators = pd.factorize(pandas_data['preseason_only'])
coords = {'kicker': kickers, 'preseason_dummy': preseason_indicators}

with pm.Model(coords=coords) as preseason_test_model:
    # Beta prior with 90% of its mass between 0.7 and 1
    beta_params = pm.find_constrained_prior(
        pm.Beta,
        lower=0.7,
        upper=1,
        mass=0.9,
        init_guess={"alpha": 10, "beta": 3},
    )
    p = pm.Beta('p', **beta_params, dims='kicker')
    mu_preseason = pm.ZeroSumNormal('mu_preseason', sigma=1, dims='preseason_dummy')
    # This is the line that produces the invalid probabilities
    p_adj = pm.Deterministic('p_adj', p[kickers_idx] + pm.math.invlogit(mu_preseason[preseason_idx]))
    y = pm.Binomial('y', n=attempts, p=p_adj, observed=makes, dims='kicker')

with preseason_test_model:
    preseason_test_model_trace = pm.sample_prior_predictive(1000)
I get the following error:
ValueError: p < 0, p > 1 or p contains NaNs. Apply node that caused the error: binomial_rv ....
I understand what is happening (or think I do!): ‘p’ is already a probability, and the inverse logit of the zero-sum normal effect also lies between 0 and 1, so their sum ‘p_adj’ can exceed 1 and leave the unit interval. Obviously this beta distribution is concentrated close to 1, which makes that more likely, but I’m mostly following examples like this notebook by Chris Fonnesbeck about hierarchical models, in which several variables are fit like this. I have tried different combinations of logit and inverse logit, and I have tried this with several models in the past, and I always get this error. Any help would be greatly appreciated!
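For what it's worth, here is a minimal sketch of the arithmetic as I understand it, in plain Python rather than PyMC. The draws are hypothetical stand-ins: `random.betavariate(10, 3)` plays the role of ‘p’ (using my `init_guess` values, not the fitted prior) and `random.gauss(0, 1)` plays the role of one level of ‘mu_preseason’. It compares my current additive form with adding the effect on the logit scale, which is only my assumption about what might be intended and not something I have confirmed from the notebook.

```python
import math
import random


def invlogit(x):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Map a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


random.seed(0)

# Hypothetical stand-ins for prior draws (not my real data or fitted prior)
p_draws = [random.betavariate(10, 3) for _ in range(1000)]
effect_draws = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Current form: probability + invlogit(effect), which can be anywhere in (0, 2)
additive = [p + invlogit(e) for p, e in zip(p_draws, effect_draws)]

# Assumed alternative: add the effect on the logit scale, then transform back
logit_scale = [invlogit(logit(p) + e) for p, e in zip(p_draws, effect_draws)]

n_invalid_additive = sum(1 for v in additive if not 0.0 <= v <= 1.0)
n_invalid_logit = sum(1 for v in logit_scale if not 0.0 <= v <= 1.0)

print("invalid p (additive form):   ", n_invalid_additive)
print("invalid p (logit-scale form):", n_invalid_logit)
```

If this reasoning is right, the PyMC equivalent would presumably be something like `pm.math.invlogit(pm.math.logit(p[kickers_idx]) + mu_preseason[preseason_idx])`, but I have not verified that this is the recommended parameterization.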