I have a question about Bayesian models. I am building a GLM that describes which variables are related to crime in a city. I am interested in the relationship between the predictors and the dependent variable (crime). I am fitting the existing data, not trying to predict future data.
So, should I use a regularizing prior such as the horseshoe?
- Regularization: I assume that only a small number of variables are related to crime. This guards against overfitting in prediction (but I am not interested in prediction). The DIC is lower.
- No regularization: I see all the coefficients in play with crime at the same time. The coefficients of the model may be large. The DIC is higher.
Can you help me with this question? Thanks.
This is hard to answer without a bit more info about the predictors, and I don’t have any experience with crime stats. But just from intuition: do something in between. Horseshoe priors provide very strong regularisation. I can’t imagine how that would be appropriate, unless most of your predictors involve the second digit of the temperature on the opposite side of the earth. Lots of things have a strong influence on crime.
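To get a feel for what a horseshoe prior assumes, here is a rough sketch that draws from a horseshoe prior and from a standard Normal prior and compares how much mass each puts near zero and far from zero. The global scale `tau=1.0`, the thresholds `0.1` and `3.0`, and the helper names are assumptions chosen for illustration, not values from this thread.

```python
import math
import random

def horseshoe_draw(rng, tau=1.0):
    """One draw of beta ~ Normal(0, (tau * lam)^2) with lam ~ HalfCauchy(1)."""
    # Half-Cauchy via the inverse CDF of a standard Cauchy, then absolute value.
    lam = abs(math.tan(math.pi * (rng.random() - 0.5)))
    return rng.gauss(0.0, tau * lam)

def fractions(draws, near=0.1, far=3.0):
    """Share of draws that are close to zero and share that are far from zero."""
    n = len(draws)
    return (sum(abs(d) < near for d in draws) / n,
            sum(abs(d) > far for d in draws) / n)

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n_draws = 20_000
hs = [horseshoe_draw(rng) for _ in range(n_draws)]
normal = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]

hs_near, hs_far = fractions(hs)
normal_near, normal_far = fractions(normal)
print(f"horseshoe: near zero {hs_near:.3f}, far out {hs_far:.3f}")
print(f"normal:    near zero {normal_near:.3f}, far out {normal_far:.3f}")
```

Under these assumptions the horseshoe puts more mass both very close to zero and far out in the tails than the Normal does, which is the "most coefficients are essentially zero, a few are large" assumption in prior form.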
Half-Student-t priors with nu=3 or so are usually a reasonable default to go for. I’d go through the predictors and try to figure out what is already known about them, and what values might be reasonable. Keep in mind that the scaling of the data has an influence on that, so maybe rescale the data first.
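A minimal sketch of the rescaling step, assuming the predictors are continuous and that z-scoring is an acceptable choice. The function name `standardize` and the unemployment numbers are made up for illustration.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: subtract the mean, divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical unemployment rates (percent) for five districts.
unemployment = [4.0, 6.5, 9.0, 12.5, 8.0]
z = standardize(unemployment)
print([round(v, 2) for v in z])
```

After standardizing, a coefficient describes the change in the linear predictor per one standard deviation of the predictor, so the same prior scale means roughly the same thing for every predictor.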
Let’s make a toy example. We have:
1. Unemployment (positive correlation with crime)
2. Poverty (positive correlation with crime)
3. Population density (positive correlation with crime)
All of these are correlated with each other somehow, and each has a relation with crime. I could go for a model using all of them together with a Normal prior, a horseshoe prior, or a half-Student-t prior. These priors encode different assumptions, but I don’t have a clear answer as to which I should use. Moreover, the goodness of fit varies a lot between these priors.
What kind of goodness-of-fit measure are you using?
If the posterior of the betas depends strongly on the prior, that means you don’t have enough data and the posterior is dominated by your assumptions. In this case I would suggest presenting all 3 models.
Another option could be some kind of model averaging.
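The point about prior sensitivity can be illustrated with a small sketch. It assumes a deliberately simplified setup that is not the model from this thread: one predictor, no intercept, known noise standard deviation, and a Normal(0, prior_sd^2) prior on beta, so the posterior mean has a closed form. All names and numbers are hypothetical.

```python
import random

def posterior_mean(x, y, prior_sd, noise_sd=1.0):
    """Posterior mean of beta for y = beta*x + noise, with beta ~ Normal(0, prior_sd^2)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + (noise_sd / prior_sd) ** 2)

def simulate(n, beta=1.0, noise_sd=1.0, seed=0):
    """Simulate n observations from the toy regression with a fixed seed."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * a + rng.gauss(0.0, noise_sd) for a in x]
    return x, y

def prior_gap(n):
    """Absolute difference in posterior mean between a tight and a wide prior."""
    x, y = simulate(n)
    tight = posterior_mean(x, y, prior_sd=0.1)
    wide = posterior_mean(x, y, prior_sd=10.0)
    return abs(wide - tight)

gap_small = prior_gap(20)       # little data: the prior matters a lot
gap_large = prior_gap(10_000)   # lots of data: the prior barely matters
print(f"gap with n=20: {gap_small:.3f}, gap with n=10000: {gap_large:.4f}")
```

If the gap between priors stays large for your real data, that is a sign the data alone are not pinning down the coefficients.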
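One simple way to sketch model averaging is to turn information-criterion values into weights, Akaike-weight style, and average a coefficient across models. This is an assumption about what "model averaging" could look like here, not full Bayesian model averaging, and the IC values and coefficients below are invented for illustration.

```python
import math

def ic_weights(ic_values):
    """Turn information-criterion values (lower is better, deviance scale) into weights."""
    best = min(ic_values.values())
    raw = {name: math.exp(-0.5 * (ic - best)) for name, ic in ic_values.items()}
    total = sum(raw.values())
    return {name: r / total for name, r in raw.items()}

# Hypothetical IC values for models fitted with three different priors.
ic = {"normal": 1012.0, "student_t": 1010.0, "horseshoe": 1015.0}
weights = ic_weights(ic)

# Hypothetical posterior-mean coefficients for one predictor under each model.
beta_unemployment = {"normal": 0.42, "student_t": 0.38, "horseshoe": 0.15}
averaged = sum(weights[m] * beta_unemployment[m] for m in ic)
print(weights, round(averaged, 3))
```

Libraries such as ArviZ offer more principled weighting schemes (for example stacking), which may be preferable to this hand-rolled version.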
If you know ahead of time that these things are correlated with crime, why don’t you use informative priors on them? Strong regularization assumes you know ahead of time that most things won’t be predictive. If most things are expected to be predictive, that’s what the prior should say (at least if your goal is to have a good model).
Model comparison should also tell you which priors are most appropriate.
ok thanks. I think the problem is having enough data.
ok perfect. This is the assumption I did not find in the papers :))