How to choose beta for a Half Cauchy distribution?

Hi,

I use a Half Cauchy prior for the standard deviation parameter of a Normal distribution in a hierarchical model. I saw that in most PyMC3’s examples, the parameter of this distribution is fixed to \beta=0.5.
However, I guess that this choice is not arbitrary and that it must be related in some way to the value that we expect to find for the parameter it models (in this case, a standard deviation).

What is the best strategy to choose it? And does it impact the inference?

Thanks!

Hi Timothé,
The choice of prior often depends on use-case, especially for hierarchical models – the different levels and the link function (if any) can modify the scientific sense of the prior. So the best practice to determine your priors is to do prior predictive checks, with the very handy pm.sample_prior_predictive function.

You can also just sample from your priors and plug the sampled parameters into your linear model: I wrote a NB to demonstrate how to do that in a multinomial regression – which is usually more complicated than simple linear regression so you should be able to adapt the idea.
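To make the "sample from your priors and plug them in" idea concrete, here is a minimal sketch in plain Python (no PyMC needed). It assumes a toy model, y ~ Normal(0, sigma) with sigma ~ HalfCauchy(beta), which is not the model from the notebook; the function names and the two beta values are made up for illustration.

```python
import math
import random

def sample_half_cauchy(beta, rng):
    """Draw from HalfCauchy(beta) by inverse CDF: beta * tan(pi * u / 2)."""
    return beta * math.tan(math.pi * rng.random() / 2.0)

def prior_predictive(beta, n_draws=5000, seed=0):
    """Simulate y ~ Normal(0, sigma) with sigma ~ HalfCauchy(beta)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        sigma = sample_half_cauchy(beta, rng)  # draw the prior parameter
        draws.append(rng.gauss(0.0, sigma))    # plug it into the toy model
    return draws

def abs_quantile(draws, q):
    """Empirical q-quantile of |y|."""
    ordered = sorted(abs(y) for y in draws)
    return ordered[min(int(q * len(ordered)), len(ordered) - 1)]

# Compare the scale of simulated data under two illustrative values of beta
for beta in (0.5, 5.0):
    ys = prior_predictive(beta)
    print(beta, abs_quantile(ys, 0.5), abs_quantile(ys, 0.95))
```

If the simulated values of y are wildly outside what is plausible for your data, that is a hint that beta is too large (or too small) for your problem.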

Note however that the half Cauchy has fat tails and can therefore disrupt inference when strong regularization is needed – thus, the exponential distribution tends to be more appropriate for std priors.
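One way to see the fat tails is to compare quantiles, which both distributions have in closed form. This is just a sketch: the helper names are made up, and beta = 1 and rate = 1 are arbitrary illustrative values, not recommendations.

```python
import math

def half_cauchy_quantile(p, beta):
    """Inverse CDF of HalfCauchy(beta): F(x) = (2 / pi) * atan(x / beta)."""
    return beta * math.tan(math.pi * p / 2.0)

def exponential_quantile(p, rate):
    """Inverse CDF of Exponential(rate): F(x) = 1 - exp(-rate * x)."""
    return -math.log(1.0 - p) / rate

# The medians are of similar size, but the upper quantiles diverge quickly
for p in (0.5, 0.9, 0.99):
    print(p, half_cauchy_quantile(p, 1.0), exponential_quantile(p, 1.0))
```

The further out in the tail you look, the larger the gap gets, which is why the half Cauchy lets the sampler wander to very large standard deviations when the data are weak.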

Finally, how does it impact inference? Well… it depends. So the best is to test several priors and see how inference about your problem changes! Priors are not an oath – just a hypothesis that you can test however you want.
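As a rough illustration of such a sensitivity test, the sketch below computes the posterior mean of sigma on a grid under two priors. Everything here is an assumption for the sake of the example: the three data points are made up, the model is a zero-mean Normal, and the prior parameters (beta = 5, rate = 1) are arbitrary. In a real hierarchical model you would refit with PyMC instead of using a grid.

```python
import math

def log_likelihood(sigma, data):
    """Zero-mean Normal log-likelihood in sigma, up to an additive constant."""
    n = len(data)
    return -n * math.log(sigma) - sum(y * y for y in data) / (2.0 * sigma ** 2)

def half_cauchy_logpdf(sigma, beta):
    return math.log(2.0 / (math.pi * beta)) - math.log1p((sigma / beta) ** 2)

def exponential_logpdf(sigma, rate):
    return math.log(rate) - rate * sigma

def posterior_mean(log_prior, data, upper=50.0, n_grid=5000):
    """Posterior mean of sigma on a uniform grid over (0, upper]."""
    grid = [upper * (i + 1) / n_grid for i in range(n_grid)]
    log_post = [log_prior(s) + log_likelihood(s, data) for s in grid]
    top = max(log_post)  # subtract the max for numerical stability
    weights = [math.exp(lp - top) for lp in log_post]
    return sum(s * w for s, w in zip(grid, weights)) / sum(weights)

data = [1.2, -0.7, 2.5]  # made-up observations
mean_hc = posterior_mean(lambda s: half_cauchy_logpdf(s, 5.0), data)
mean_exp = posterior_mean(lambda s: exponential_logpdf(s, 1.0), data)
print(mean_hc, mean_exp)
```

With only a handful of observations the two posterior means differ noticeably; with more data the gap shrinks, which is the kind of thing this test is meant to reveal.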

Hope this helps! PyMCheers

Hi Alex,
Thanks for your answer! I will look at the prior predictive checks idea!
I usually use a Jeffreys prior for variance parameters, when it is possible to do the computation analytically.
In the case of MCMC, it is not possible to use it because it is improper. But what do you think of the choice of the inverse gamma prior with very small \alpha and \beta? I believe that the limit of the inverse gamma is the Jeffreys prior when \alpha and \beta approach 0, isn’t it?
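Written out, the limit in question looks like this. It is a sketch that assumes the inverse gamma is placed on the variance \sigma^2 with shape \alpha and scale \beta; the limit only holds for the unnormalized kernel, since the limiting density is improper.

```latex
p(\sigma^2 \mid \alpha, \beta)
  = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \, (\sigma^2)^{-\alpha - 1}
    \exp\!\left(-\frac{\beta}{\sigma^2}\right)
  \;\propto\; (\sigma^2)^{-\alpha - 1} \exp\!\left(-\frac{\beta}{\sigma^2}\right)
  \;\xrightarrow{\;\alpha, \beta \to 0\;}\; \frac{1}{\sigma^2}
```

On the standard deviation scale this corresponds to p(\sigma) \propto 1/\sigma, by the change of variables p(\sigma) = p(\sigma^2) \cdot 2\sigma.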

Hi Timothé,
I don’t know the inverse-Gamma very well, so, in general, I’d say that if you think it’s appropriate through your domain knowledge, then go for it!
My advice would just be to test it against an exponential prior, and see if inference changes. Coupled with prior pred checks and domain knowledge, this should give you a good idea of what an appropriate prior is in this case.

The maximum entropy principle could also help you decide. To quote McElreath (Statistical Rethinking 2nd ed., p.326):

“The exponential distribution has maximum entropy among all non-negative continuous distributions with the same average displacement. […] The gamma distribution has maximum entropy among all distributions with the same mean and same average logarithm.”

But I wouldn’t sweat over it, as Gamma and Exponential are related (a Gamma with integer shape is a sum of i.i.d. exponential RVs).
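If you want to convince yourself of the Gamma/Exponential relationship, a quick simulation with the standard library does the job. This is only a sketch: the shape k = 3, the rate 2.0, and the helper names are arbitrary choices for illustration.

```python
import random

def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v

def sum_of_exponentials(k, rate, n_draws, rng):
    """Each draw is the sum of k independent Exponential(rate) variables."""
    return [sum(rng.expovariate(rate) for _ in range(k)) for _ in range(n_draws)]

def gamma_draws(k, rate, n_draws, rng):
    """Direct Gamma draws; gammavariate takes shape and *scale* = 1 / rate."""
    return [rng.gammavariate(k, 1.0 / rate) for _ in range(n_draws)]

rng = random.Random(42)
k, rate, n = 3, 2.0, 20000
print(mean_and_variance(sum_of_exponentials(k, rate, n, rng)))
print(mean_and_variance(gamma_draws(k, rate, n, rng)))
# Both should be close to mean k / rate and variance k / rate**2
```

Note that this only works for integer shape; a Gamma with non-integer shape is not a finite sum of exponentials.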
