I am trying to fit a regression model in PyMC3 to estimate a percentage, 'sleepEf' (scaled to the interval 0 to 1), from a single predictor variable, 'tst'. Since a percentage is bounded, I believe I should use a beta-distributed outcome variable with a logit transform. Here is my model in PyMC3:
with pm.Model() as pooled_model:
    b_intr = pm.Normal('b_intr', mu=0.8, sd=100 ** 2)
    b_tst = pm.Normal('b_tst', mu=0.0, sd=100 ** 2)
    model_err = pm.HalfNormal('model_err', sd=50)
    # Expected value
    y_est = pm.math.invlogit(b_intr + b_tst * data['tst'])
    # Data likelihood
    y_like = pm.Beta('y_like', mu=y_est, sd=model_err, observed=data['sleepEf'])
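To sanity-check the mean/sd parameterisation used in the likelihood, here is a minimal sketch in plain Python of the standard moment-matching conversion from a mean and standard deviation to the Beta shape parameters. The function name `beta_shape_from_mean_sd` is my own, and I am assuming (not asserting) that PyMC3 converts `mu` and `sd` in an equivalent way. The point is that a Beta distribution with mean `mu` can only exist if `sd ** 2 < mu * (1 - mu)`, so the sd can never reach 0.5.

```python
def beta_shape_from_mean_sd(mu, sd):
    """Moment-match a mean/sd pair to Beta shape parameters (alpha, beta).

    Returns None when no Beta distribution has this mean and sd,
    i.e. unless 0 < mu < 1 and 0 < sd**2 < mu * (1 - mu).
    """
    if not 0.0 < mu < 1.0 or sd <= 0.0:
        return None
    kappa = mu * (1.0 - mu) / sd ** 2 - 1.0  # kappa equals alpha + beta
    if kappa <= 0.0:
        return None  # variance too large for a Beta with this mean
    return mu * kappa, (1.0 - mu) * kappa


# A small sd gives valid shape parameters.
print(beta_shape_from_mean_sd(0.8, 0.1))
# An sd on the scale of my HalfNormal(sd=50) prior does not.
print(beta_shape_from_mean_sd(0.8, 50.0))
```

If this conversion is what happens internally, most draws of `model_err` from a `HalfNormal` with `sd=50` would fall outside the valid range, which may be relevant to the error I am seeing.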
I have made sure that the sleepEf variable is never exactly 0 or 1. Here is a histogram of it:
[Figure: histogram of sleepEf]
When I try to find the MAP estimate and then sample, I get an error:
with pooled_model:
    start = pm.find_MAP(fmin=sp.optimize.fmin_powell)
    hierarchical_trace = pm.sample(2000, step=pm.Metropolis(), start=start, tune=1000)
ValueError: Optimization error: max, logp or dlogp at max have non-finite values. Some values may be outside of distribution support.
max: {'b_intr': array(3.3879289615458417), 'b_tst': array(2.587928961545842)}
logp: array(-inf)
dlogp: array([-2.58792896e-08, -2.58792896e-08])
Check that 1) you don't have hierarchical parameters, these will lead to points with infinite density. 2) your distribution logp's are properly specified. Specific issues:
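One thing I can check by hand is what the point reported in the error implies for the expected value. The sketch below is plain Python, not PyMC3: `invlogit` is my own helper, the coefficients are copied from the error message, and the `tst` values are hypothetical placeholders rather than my actual data. I have not verified whether PyMC3's own `invlogit` behaves identically at the extremes.

```python
import math


def invlogit(x):
    """Numerically stable inverse logit (logistic) function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Coefficients reported in the error message above.
b_intr = 3.3879289615458417
b_tst = 2.587928961545842

# Hypothetical predictor values; the real ones come from data['tst'].
for tst in (0.5, 5.0, 400.0):
    mu = invlogit(b_intr + b_tst * tst)
    # The last column shows whether the mean is still strictly below 1.
    print(tst, mu, mu < 1.0)
```

With a large enough predictor value the mean rounds to exactly 1 in double precision, which is on the boundary of the Beta distribution's support. If my unscaled 'tst' values are large, this could be one source of the non-finite logp.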
Have I incorrectly specified the model?
Also note that if I do not call find_MAP before sampling, the sampled parameter values never change from their prior values.
Thanks!
I have also asked this question here: