Hi there,
I know this question is covered in the FAQ, so please point me back there if I missed something obvious.
I have a model that raises ParallelSamplingError: Bad initial energy. Here it is:
import pymc3 as pm

# Observed counts
notifications_sent = 2250380
r_o = 687795
v_o_and_r_o = 26198
v_o = 248

with pm.Model() as reception_model:
    # Uniform Beta(1, 1) priors on the three probabilities
    alpha = pm.Beta('alpha', alpha=1, beta=1)
    beta = pm.Beta('beta', alpha=1, beta=1)
    gamma = pm.Beta('gamma', alpha=1, beta=1)
    # Latent counts, each followed by its recorded (observed) counterpart
    r = pm.Binomial('received', n=notifications_sent, p=alpha)
    r_r = pm.Binomial('received_recorded', n=r, p=gamma, observed=r_o+v_o_and_r_o)
    v = pm.Binomial('viewed', n=r, p=beta)
    v_r = pm.Binomial('viewed_recorded', n=v, p=gamma, observed=v_o+v_o_and_r_o)
    trace = pm.sample(1000, tune=500, chains=2)
reception_model.check_test_point() returns -inf for a few RVs, as expected:
alpha_logodds__ -1.39
beta_logodds__ -1.39
gamma_logodds__ -1.39
received -5.23
viewed -4.88
received_recorded -inf
viewed_recorded -inf
Checking reception_model.test_point reveals this:
{'alpha_logodds__': array(0., dtype=float32),
'beta_logodds__': array(0., dtype=float32),
'gamma_logodds__': array(0., dtype=float32),
'received': array(11078, dtype=int16),
'viewed': array(5539, dtype=int16)}
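As a sanity check on the test values above, here is a small standard-library sketch. It rests on two assumptions of mine that I have not confirmed in the PyMC3 source: that the default test value of a Binomial is roughly its mode, n * p, and that the int16 dtype shown above makes that value wrap around. The helpers wrap_int16 and binom_logpmf are hypothetical names written for this post, not PyMC3 functions.

```python
import math

def wrap_int16(x):
    """Wrap a Python int the way a 16-bit signed integer overflows."""
    return (x + 2**15) % 2**16 - 2**15

def binom_logpmf(k, n, p):
    """Binomial log-pmf; returns -inf when k lies outside 0..n."""
    if k < 0 or k > n:
        return -math.inf
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

notifications_sent = 2250380
observed_received = 687795 + 26198  # r_o + v_o_and_r_o
observed_viewed = 248 + 26198       # v_o + v_o_and_r_o

# Assumption: the default test value is about n * p, with p = 0.5
# (a logodds of 0), then stored as int16.
received_mode = notifications_sent // 2
received_test = wrap_int16(received_mode)
viewed_test = received_test // 2

print(received_mode, received_test, viewed_test)

# With n this small, both observed counts fall outside the support.
print(binom_logpmf(observed_received, received_test, 0.5))
print(binom_logpmf(observed_viewed, viewed_test, 0.5))
```

If those assumptions hold, the wrapped values reproduce the 11078 and 5539 shown in the test point, and both observed counts exceed their n.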
So the log-probabilities of received_recorded and viewed_recorded are -inf because n for each Binomial is smaller than the observed value. Following a recommendation from a FAQ answer, one approach is to set the testval for received and viewed so that they are at least as large as the observed values. So I updated the model like so:
notifications_sent = 2250380
r_o = 687795
v_o_and_r_o = 26198
v_o = 248

with pm.Model() as reception_model:
    alpha = pm.Beta('alpha', alpha=1, beta=1)
    beta = pm.Beta('beta', alpha=1, beta=1)
    gamma = pm.Beta('gamma', alpha=1, beta=1)
    # testval starts each latent count 1000 above its observed counterpart
    r = pm.Binomial('received', n=notifications_sent, p=alpha, testval=r_o+v_o_and_r_o+1000, dtype='int64')
    r_r = pm.Binomial('received_recorded', n=r, p=gamma, observed=r_o+v_o_and_r_o, dtype='int64')
    v = pm.Binomial('viewed', n=r, p=beta, testval=v_o+v_o_and_r_o+1000, dtype='int64')
    v_r = pm.Binomial('viewed_recorded', n=v, p=gamma, observed=v_o+v_o_and_r_o, dtype='int64')
This makes things worse. reception_model.check_test_point() now returns:
alpha_logodds__ -1.39
beta_logodds__ -1.39
gamma_logodds__ -1.39
received -inf
viewed -inf
received_recorded -inf
viewed_recorded -14734.69
The test points are:
{'alpha_logodds__': array(0., dtype=float32),
'beta_logodds__': array(0., dtype=float32),
'gamma_logodds__': array(0., dtype=float32),
'received': array(714993),
'viewed': array(27446)}
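To see which of these values are actually impossible under the model, I also evaluated the four Binomial log-probabilities by hand at the new test point, in double precision and with all probabilities at 0.5. This is only a rough sketch: binom_logpmf is a hypothetical helper written for this post, not a PyMC3 function, and it does not reproduce whatever precision PyMC3 uses internally.

```python
import math

def binom_logpmf(k, n, p):
    """Binomial log-pmf; returns -inf when k lies outside 0..n."""
    if k < 0 or k > n:
        return -math.inf
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

notifications_sent = 2250380
observed_received = 687795 + 26198      # r_o + v_o_and_r_o
observed_viewed = 248 + 26198           # v_o + v_o_and_r_o
received_test = observed_received + 1000  # testval for 'received'
viewed_test = observed_viewed + 1000      # testval for 'viewed'

# Each node evaluated at the test point, with p = 0.5 throughout.
logp = {
    'received': binom_logpmf(received_test, notifications_sent, 0.5),
    'viewed': binom_logpmf(viewed_test, received_test, 0.5),
    'received_recorded': binom_logpmf(observed_received, received_test, 0.5),
    'viewed_recorded': binom_logpmf(observed_viewed, viewed_test, 0.5),
}
for name, value in logp.items():
    print(f"{name:18s} {value:.2f}")
```

In this sketch every value is within the support, so all four come out finite (though hugely negative), and viewed_recorded lands close to the -14734.69 reported above. That makes me suspect the remaining -inf values come from something other than the counts being out of range, perhaps numerical precision, but that is a guess on my part.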
So I guess my next step should be to rethink my priors or my model structure, but I'm not sure how to work out what is going wrong with my current model. I'd appreciate any guidance on next steps. Hopefully it's just something I missed in the FAQ!
Cheers,
Jesse