Thanks for this. Breaking the observed data into the Bernoulli and gamma parts is a fine workaround and aligns with approaches I’ve taken in Stan before.
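For reference, a minimal sketch of that kind of split, written in Python rather than Stan. It assumes the data are a mix of exact zeros and positive values (a hurdle-style setup), which may not match the model in this thread exactly. The function names, the example data, and the parameter values are all made up for illustration.

```python
import math

def gamma_lpdf(y, shape, rate):
    # Log density of Gamma(shape, rate) at y > 0.
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(y) - rate * y)

def bernoulli_lpmf(z, p):
    # Log probability of a 0/1 indicator z with success probability p.
    return math.log(p) if z == 1 else math.log(1 - p)

def split_loglik(ys, p_positive, shape, rate):
    # Assumption: zeros carry no gamma information, so the likelihood
    # factors into a Bernoulli part (is y positive?) and a gamma part
    # (how large is y, given that it is positive?).
    is_positive = [1 if y > 0 else 0 for y in ys]
    positives = [y for y in ys if y > 0]
    bern = sum(bernoulli_lpmf(z, p_positive) for z in is_positive)
    gam = sum(gamma_lpdf(y, shape, rate) for y in positives)
    return bern, gam

# Hypothetical data and parameter values.
ys = [0.0, 1.5, 0.0, 2.0, 0.7]
bern, gam = split_loglik(ys, p_positive=0.6, shape=2.0, rate=1.0)
total = bern + gam  # the two parts simply add on the log scale
```

Because the two parts share no parameters in this sketch, each can be fit and checked separately, which is what makes the split useful for narrowing down where the sampler struggles.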
+1 to making the priors extremely tight around the true values. All this does is make sampling take longer, and I still get divergent transitions.
Would you like me to open an issue on GitHub?