Rule extraction with Bayesian Logistic Regression

The horseshoe prior is also a continuous form of the spike and slab. The only problem is that if you have a continuous prior, you have a continuous posterior, so no coefficient can get non-zero posterior probability mass at exactly zero. Oddly, the paper you linked doesn’t seem to mention this. If all you care about is predictive performance, shrinking to nearly zero is good enough (after you take the scale of the covariates into account). But if you have a bajillion covariates and want to trim them for run-time speed, you need a post-processing step to actually drop the ones that shrank to nearly zero.
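To make the post-processing idea concrete, here is a minimal sketch of one simple option: hard-thresholding the posterior mean of each coefficient after multiplying by the standard deviation of its covariate. This is only an illustration, not the method from the linked paper. The function name `select_covariates`, the `threshold` value, and the simulated `beta_draws` (standing in for real posterior draws from a horseshoe fit) are all assumptions I made up for the example.

```python
import numpy as np

def select_covariates(beta_draws, X, threshold=0.1):
    """Hypothetical helper: keep covariates whose scaled effect is not negligible.

    beta_draws: (S, K) array of posterior draws of the coefficients.
    X:          (N, K) design matrix used in the fit.
    threshold:  assumed cutoff on |posterior mean| * sd(covariate).
    """
    beta_mean = beta_draws.mean(axis=0)   # posterior mean per coefficient
    x_sd = X.std(axis=0)                  # scale of each covariate
    effect = np.abs(beta_mean) * x_sd     # effect of a one-sd change in x_k
    return np.flatnonzero(effect > threshold)

# Simulated stand-ins for a real design matrix and real posterior draws.
rng = np.random.default_rng(0)
N, K, S = 200, 6, 1000
col_scale = np.array([1.0, 1.0, 10.0, 1.0, 0.1, 1.0])
X = rng.normal(size=(N, K)) * col_scale

# Draws concentrated near these centers; the zeros mimic shrunken coefficients.
center = np.array([1.5, 0.0, 0.2, 0.0, 0.0, -0.8])
beta_draws = center + 0.01 * rng.normal(size=(S, K))

kept = select_covariates(beta_draws, X, threshold=0.1)
X_trimmed = X[:, kept]                    # smaller design matrix for prediction
```

Note that the third coefficient is small (0.2) but its covariate has a large scale, so it survives the cut. That is why the raw coefficient size alone is not a good criterion. After trimming you would normally refit or otherwise check that predictions have not degraded.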
