quite often when I set up a hierarchical model in in pymc3 I encounter the following problem: My lower-level parameters (e.g., subject-level parameters) are drawn from higher-level distributions (e.g. group-level parameters). Now quite often I have parameters on the lower level that are not supposed to take negative values. In a non-hierarchical version, I can use direct priors that establish this bound. But how do I proceed in the hierarchical version, where I typically use Normal distributions between the group and subject level? Here are a few more detailed questions/ideas:
b) Should I rather use a log-Normal to link the subject and group level? How would I have to transform the parameters?
c) Or should I (log-?) transform the data?
I know the question is not pymc3 specific per se. But I believe there are some expert here who hopefully could help and I’d be very happy about some pymc3 specific solutions.
I run into this quite often; subject level parameters are only interpretable if they are positive.
One strategy I use is to model the logarithm of the population mean for these random variables and then have them come from a log-normal distribution. Here is an example from my own work on pharmacokinetics.
I usually need to model the volume of blood in a patient’s body. Obviously, this must be a positive number and can vary between patients, so a random effect is suitable. I may specify my generative model as follows
Since z is normal, then log_mu+z*sigma is normal, which means taking the expoenenital of this is expression yields a log-normal random variable. The exponential forces the parameters to be positive without having to resort to constraining priors via pm.Bound.
Yes, thank you, I think it might help. Just trying it out. I’m not yet sure about some practical aspects. e.g. do I have to exp(log_mu) to get my group-level mean? How does the transformation affect the prior I put on sigma …? Any hints highly welcome…
Under this parameterization, the random variable is log normal. You can leverage all the information about log normal random variables to find things like the expectation. See here https://en.wikipedia.org/wiki/Log-normal_distribution
So in this case, the population level mean would be exp(mu + sigma/2)
As for the prior on sigma, obviously it can’t be too large because the exponential will just make it worse. I would recommend you do some prior predictive checks.