I am trying to model a standard deviation, which obviously must be positive, but with the centered parameterization I get a handful of divergences. I naively tried to re-parameterize with

```python
sigma = pm.Deterministic('sigma', global_sigma + offset * global_sigma_sd)
```

but got a `Bad Initial Energy` error when sampling.
Then, to put a lower bound on the deterministic variable, I tried `sigma.clip(0.0, np.inf)` and `tt.switch(sigma < 0, 0.0, sigma)`, but in both cases I again get a `Bad Initial Energy` error when sampling.
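My guess at why clipping to exactly 0 cannot work: any negative draw lands at `sigma = 0`, where the Normal log-density is not finite (and the gradient is flat across the whole clipped region). A hand-rolled numpy sketch, not PyMC3 code:

```python
import numpy as np

def normal_logpdf(x, mu, sigma):
    # Hand-rolled Normal log-density, just to inspect the value at sigma = 0.
    with np.errstate(divide='ignore', invalid='ignore'):
        return -0.5 * np.log(2 * np.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

sigma = np.clip(-2.0, 0.0, np.inf)  # a negative draw clipped to 0
logp = normal_logpdf(1.0, 0.0, sigma)
print(np.isfinite(logp))  # False: the energy is undefined at the clipped value
```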
When I use `1e-15` instead of `0` as the lower bound, with `clip` I get:

```
ValueError: Mass matrix contains zeros on the diagonal.
```

and with `tt.switch`:

```
The chain contains only diverging samples. The model is probably misspecified.
```
I also tested with a latent variable:

```python
offset = pm.Normal('offset', mu=0, sigma=1.0)
b_offset = pm.Bound(pm.Normal, lower=0.0)('bound_offset', mu=global_sigma + offset, sigma=1)
sigma = pm.Deterministic('sigma', b_offset * global_sigma_sd)
```

This gives no divergences, but it takes 2-4 times longer to sample and it changes the model, since `a + b * c != (a + b) * c`, so it doesn't seem like a good idea.
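To spell out the mismatch: the bounded construction effectively multiplies the whole sum by `global_sigma_sd`, not just the offset term:

```python
# Stand-in values: a = global_sigma, b = offset, c = global_sigma_sd.
a, b, c = 1.0, 0.5, 2.0

intended = a + b * c   # the original non-centered form
bounded = (a + b) * c  # what the pm.Bound construction computes

print(intended, bounded)  # 2.0 3.0 -- different models
```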
How can this be solved? Is the best option to transform the data, e.g. square the demeaned data, take the log, and then model `mu` instead of `sigma`?
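For what it's worth, here is a numpy sketch of that transform idea (made-up data, mean assumed known): if `x ~ N(m, sigma)`, then `log((x - m)**2)` is unconstrained and has mean `2 * log(sigma)` plus the known constant `E[log(chi2_1)] = -euler_gamma - log(2)`, so a location model on the transformed data would recover `sigma` with no positivity constraint:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_true = 2.0
x = rng.normal(0.0, sigma_true, size=100_000)  # demeaned data (mean known to be 0)

y = np.log(x ** 2)  # square the demeaned data, then log-transform
const = -np.euler_gamma - np.log(2.0)  # E[log(chi2_1)] = digamma(1/2) + log(2)
sigma_est = np.exp((y.mean() - const) / 2.0)
print(sigma_est)  # close to sigma_true
```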
This is related to this ongoing struggle.