How to choose beta for Half Cauchy distribution?

tboutelier · November 7, 2019, 4:52pm

Hi,

I use a Half Cauchy prior for the standard deviation parameter of a Normal distribution in a hierarchical model. I saw that in most PyMC3’s examples, the parameter of this distribution is fixed to \beta=0.5.
However, I guess that this choice is not arbitrary and that it must be related in some way to the value we expect for the parameter it models (in this case, a standard deviation).

What is the best strategy to choose it? And does it impact the inference?

Thanks!

AlexAndorra · November 7, 2019, 7:39pm

Hi Timothé,
The choice of prior often depends on the use case, especially for hierarchical models – the different levels and the link function (if any) can modify the scientific sense of the prior. So the best practice to determine your priors is to do prior predictive checks, with the very handy pm.sample_prior_predictive function.

You can also just sample from your priors and plug the sampled parameters into your linear model: I wrote a NB to demonstrate how to do that in a multinomial regression – which is usually more complicated than simple linear regression so you should be able to adapt the idea.
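As a rough illustration of the "sample from your priors" idea, here is a minimal sketch that uses only the Python standard library rather than PyMC. The toy model (y ~ Normal(0, sigma) with sigma ~ HalfCauchy(beta)) and the helper names `sample_half_cauchy`, `prior_predictive` and `quantile` are assumptions made for this example, not something taken from the notebook mentioned above. It shows that beta acts as a scale: the prior median of sigma is beta itself, while the upper quantiles sit far above it.

```python
import math
import random


def sample_half_cauchy(beta, rng):
    """Draw one value from HalfCauchy(beta) by inverse-CDF sampling."""
    # CDF is F(x) = (2/pi) * atan(x / beta), so x = beta * tan(pi * u / 2).
    return beta * math.tan(0.5 * math.pi * rng.random())


def prior_predictive(beta, n_draws=20000, seed=0):
    """Sample sigma from the prior, then y ~ Normal(0, sigma) for each sigma."""
    rng = random.Random(seed)
    sigmas = [sample_half_cauchy(beta, rng) for _ in range(n_draws)]
    ys = [rng.gauss(0.0, s) for s in sigmas]
    return sigmas, ys


def quantile(values, q):
    """Crude empirical quantile, good enough for eyeballing a prior."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


for beta in (0.5, 5.0):
    sigmas, ys = prior_predictive(beta)
    abs_ys = [abs(y) for y in ys]
    print(
        f"beta={beta}: median sigma={quantile(sigmas, 0.5):.2f}, "
        f"99% sigma={quantile(sigmas, 0.99):.1f}, "
        f"99% |y|={quantile(abs_ys, 0.99):.1f}"
    )
```

If the 99% quantile of |y| is far outside what is plausible for your data, that suggests beta (or the tail of the prior) is too generous for the problem.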

Note however that the half Cauchy has fat tails and can therefore disrupt inference when strong regularization is needed – thus, the exponential distribution tends to be more appropriate for std priors.
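To put numbers on "fat tails", the sketch below compares the closed-form tail probabilities P(sigma > x) of a HalfCauchy(beta) and an Exponential(lam). Matching the two priors by their median (lam = ln 2 / beta) is an assumption made here purely to get a like-for-like comparison; other matchings are equally defensible.

```python
import math


def half_cauchy_tail(x, beta):
    """P(sigma > x) for sigma ~ HalfCauchy(beta)."""
    return 1.0 - (2.0 / math.pi) * math.atan(x / beta)


def exponential_tail(x, lam):
    """P(sigma > x) for sigma ~ Exponential(rate=lam)."""
    return math.exp(-lam * x)


beta = 0.5
lam = math.log(2.0) / beta  # assumption: give both priors the same median (= beta)

for x in (0.5, 2.0, 5.0, 20.0):
    print(
        f"x={x:>5}: half-Cauchy tail={half_cauchy_tail(x, beta):.2e}, "
        f"exponential tail={exponential_tail(x, lam):.2e}"
    )
```

The half-Cauchy tail decays only like 1/x, whereas the exponential tail decays exponentially, so the half-Cauchy keeps non-negligible prior mass on standard deviations many times larger than beta.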

Finally, how does it impact inference? Well… it depends. So the best approach is to test several priors and see how inference about your problem changes! Priors are not an oath – just a hypothesis that you can test however you want.

Hope this helps! PyMCheers

tboutelier · November 8, 2019, 8:17am

Hi Alex,
Thanks for your answer! I will look at the prior predictive checks idea!
I usually use a Jeffreys prior for variance parameters, when it is possible to do the computation analytically.
In the case of MCMC, it is not possible to use it because it is improper. But what do you think of the choice of the inverse gamma prior with very small \alpha and \beta? I believe that the limit of the inverse gamma is the Jeffreys prior when \alpha and \beta approach 0, isn’t it?
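
For reference, the limit asked about here can be written out explicitly. This assumes the usual shape/scale parameterisation of the inverse gamma on the variance \sigma^2, and takes the Jeffreys prior for the variance of a Normal with known mean to be proportional to 1/\sigma^2:

```latex
p(\sigma^2 \mid \alpha, \beta)
  = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
    (\sigma^2)^{-\alpha-1}\exp\!\left(-\frac{\beta}{\sigma^2}\right)
  \;\propto\;
    (\sigma^2)^{-\alpha-1}\exp\!\left(-\frac{\beta}{\sigma^2}\right)
  \;\xrightarrow[\alpha,\,\beta \to 0]{}\;
    \frac{1}{\sigma^2}
```

So the kernel does converge pointwise to the Jeffreys form. The normalising constant \beta^{\alpha}/\Gamma(\alpha) goes to zero in the same limit, which is why the limiting prior is improper while any small positive \alpha and \beta still give a proper one.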

AlexAndorra · November 8, 2019, 12:17pm

Hi Timothé,
I don’t know the inverse-Gamma very well, so, in general, I’d say that if you think it’s appropriate through your domain knowledge, then go for it!
My advice would just be to test it against an exponential prior, and see if inference changes. Coupled with prior pred checks and domain knowledge, this should give you a good idea of what an appropriate prior is in this case.

The maximum entropy principle could also help you decide. To quote McElreath (Statistical Rethinking 2nd ed., p.326):

“The exponential distribution has maximum entropy among all non-negative continuous distributions with the same average displacement. […] The gamma distribution has maximum entropy among all distributions with the same mean and same average logarithm.”

But I wouldn’t sweat it, as the Gamma and Exponential are related (a Gamma with integer shape is a sum of independent exponential RVs).
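
The Gamma/Exponential relationship is easy to check by simulation. The sketch below, using only the standard library, compares the sum of k independent Exponential(rate) draws with direct Gamma(shape=k, scale=1/rate) draws; the values k = 3 and rate = 2.0 are arbitrary choices for the illustration.

```python
import random
import statistics

rng = random.Random(42)
k, rate, n = 3, 2.0, 50_000  # illustrative values, not from the thread

# Sum of k independent exponentials with the same rate.
sum_of_exp = [sum(rng.expovariate(rate) for _ in range(k)) for _ in range(n)]

# Direct Gamma draws; random.gammavariate takes (shape, scale).
gamma_draws = [rng.gammavariate(k, 1.0 / rate) for _ in range(n)]

for name, draws in (("sum of exponentials", sum_of_exp), ("gamma", gamma_draws)):
    mean = statistics.fmean(draws)
    var = statistics.pvariance(draws)
    print(f"{name}: mean={mean:.3f}, variance={var:.3f}")
```

Both samples should have a mean close to k/rate = 1.5 and a variance close to k/rate^2 = 0.75, up to Monte Carlo error.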

tboutelier · November 8, 2019, 2:16pm

Hi Alex, thanks for your comments!
