I have a three level hierarchical model that has:
- hyper-hyper parameters (for populations)
- hyper parameters (for sub-populations)
- measurements
The measurements are N(mu, sigma) where the mu and sigma are sampled from the hyper parameters.
The hyper-parameters for each sub-population are mu ~ N(hyper-mu, hyper-sigma) and sigma ~ Gamma(mu=hyper-sd-mean, sigma=hyper-sd-sd).
I have been having troubles with divergence, so first I reformulated the mus as recommended using
offset=pm.Normal('offset', mu=0, sd=1)
mu = pm.Deterministic('mu', hyper_mu + offset * hyper_sigma)
pm.Normal('measurement', mu=mu, sd=sd)
That worked, but didn’t solve my divergence problem. So I moved to the next step, and reformulated the sd
:
sigma_offset = pm.Normal('sigma offset', mu=0, sd=1)
sigma = pm.Deterministic('sigma', largest(hyper_sigma_mu + sigma_offset * hyper_sigma_sd, 0))
But when I added this my chains started failing, with this error:
ValueError: Bad initial energy: inf. The model might be misspecified.
It’s quite possible either (a) there’s a deep reason behind this which I don’t know or (b) I have done something stupid (like a typo).
Can anyone suggest either what is wrong with reformulating the standard deviation hyper-parameters like this or how to find whatever bug there is? I suppose the latter would be to somehow look at the probability function so I can see what is wrong with my code?
I’m lost here and would be very grateful for any guidance.