What is a good strategy to model individual noise parameters hierarchically (with partial pooling)?


I am trying to fit a noise/scale parameter to individuals, allowing for some partial pooling.

I thought about doing something analogous to the normal non-centered parameterization for location parameters. As values must be positive, I try to work in log-scale and exponentiate the result of the operation, but I am not sure if this makes sense (and if it does, whether I am doing it correctly):

import pymc3 as pm

# n_individuals (number of individuals in the data) is assumed to be defined elsewhere
with pm.Model() as model:
    # The unit scale of the data is rather large, hence the sd=5 in log-scale
    noise_pop_log = pm.Normal('noise_pop_log', mu=0, sd=5)
    noise_pop_spread = pm.HalfNormal('noise_pop_spread', sd=1)

    noise_individual_offset = pm.Normal('noise_individual_offset', 0, 1, shape=n_individuals)
    noise_individual = pm.Deterministic('noise_individual', pm.math.exp(noise_pop_log + noise_pop_spread * noise_individual_offset))

Does anyone have a good opinion on this? What else would you suggest?
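
For reference, the construction in the question can be written out in math notation. This is only a restatement of the code above, not a new model; the symbols \mu, \tau and z_i are labels introduced here for noise_pop_log, noise_pop_spread and noise_individual_offset, and the second argument of Normal is a standard deviation.

```latex
\begin{aligned}
\mu &\sim \mathrm{Normal}(0, 5), \qquad \tau \sim \mathrm{HalfNormal}(1), \\
z_i &\sim \mathrm{Normal}(0, 1), \\
\sigma_i &= \exp(\mu + \tau z_i)
\quad\Longleftrightarrow\quad
\log \sigma_i \mid \mu, \tau \sim \mathrm{Normal}(\mu, \tau).
\end{aligned}
```

Read this way, each individual noise parameter is log-normally distributed around a shared median \exp(\mu), and \tau controls how strongly individuals are pooled toward it (as \tau goes to 0, all individuals share the same noise level).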

That seems like a reasonable starting point. Here’s an article on obtaining individual noise parameters via regression. Note that its author uses the softplus transformation, which may be more numerically stable than exp for large argument values. Also, the reason we use the noncentered parameterization for regression coefficients is (usually) to deal with collinear covariates or weakly identifiable parameters. The same issues are usually not as severe with variance parameters. I think you could also simply add together positive random variables, or do something like $\sigma_i \sim \text{HalfNormal}(\alpha)$, $\alpha \sim \text{HalfNormal}(10)$.
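
To illustrate the numerical-stability point about softplus versus exp, here is a minimal plain-Python sketch (standard library only, not PyMC code). The helper name `softplus` and the test values are chosen here for illustration; the stable formula max(x, 0) + log1p(exp(-|x|)) is one common way to compute log(1 + exp(x)), not necessarily what the linked article uses.

```python
import math


def softplus(x):
    """Return log(1 + exp(x)) without exponentiating a large positive number."""
    # exp(-|x|) is always in (0, 1], so it cannot overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


for x in (-5.0, 0.0, 5.0, 50.0, 800.0):
    try:
        exp_value = math.exp(x)
    except OverflowError:
        # math.exp raises for arguments this large
        exp_value = float("inf")
    print(f"x={x:7.1f}  exp(x)={exp_value:.4g}  softplus(x)={softplus(x):.4g}")
```

Both transformations map the real line to positive values, but softplus grows only linearly for large inputs, so an extreme proposal during sampling is less likely to produce an overflowing scale. The trade-off is interpretation: with softplus the unconstrained parameter is no longer the log of the noise scale.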


Thank you for the clarification!