That means you want the probability of a malicious flag (the Bernoulli probability of success) to increase when `invoke` increases, which means you expect the coefficient on that predictor to be concentrated on positive values. You can encode that knowledge directly into the mean of your Normal prior. The same reasoning applies to any of your predictors.
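As a minimal sketch (the predictor name `invoke` and the prior values here are illustrative, not taken from your model), shifting the Normal prior's mean above zero is enough to put most prior mass on positive slopes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical prior for the coefficient on `invoke`:
# centering the Normal above zero encodes "more invokes -> higher flag probability".
beta_invoke = rng.normal(loc=1.0, scale=0.5, size=10_000)

# Most of the prior mass now sits on positive slopes (~98% with these values).
print((beta_invoke > 0).mean())
```

Note this is a soft constraint: the prior still allows negative slopes, so the data can overrule you if your domain knowledge turns out to be wrong.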
Then you need to do prior and posterior predictive checks to see how (and even if) these changes affect your model and results.
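A prior predictive check can be sketched by hand: draw parameters from the priors, push them through the sigmoid, and look at whether the implied probabilities are plausible before seeing any data (all names and prior values below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical standardized predictor values for `invoke`
invoke = rng.normal(size=200)

# Draw parameters from illustrative priors
intercept = rng.normal(-3.0, 1.0, size=1_000)
beta = rng.normal(1.0, 0.5, size=1_000)

# Prior predictive: implied success probabilities, one row per prior draw
p = sigmoid(intercept[:, None] + beta[:, None] * invoke[None, :])

# With a negative intercept prior, most implied probabilities should be low
print(p.mean(), np.quantile(p, [0.05, 0.95]))
```

If the implied probabilities pile up at 0 and 1, your priors are too wide on the logit scale; most PPLs (PyMC, Stan, etc.) have built-in helpers for this, so you don't have to do it manually.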
More generally, I’m skeptical that you need 2k predictors to have a good model. I would encourage you to try your hand at a much more parsimonious model first. That’s because:
- It’ll be much easier to reason about it (I personally can’t hold 2k predictors in my head and explain how they independently influence the probability of success).
- It’ll probably make fitting more efficient – my prior is that lots of these 2k predictors are strongly correlated, i.e. they give you redundant information.
- From a scientific standpoint, my guess for now is that you don’t need 2k predictors to explain this phenomenon.
Setting `constant=0` means that the probability of success in that case is expected to be 50% (because sigmoid(0) = 0.5), so that’s actually not what you want.
I would keep the intercept (it’s usually very helpful in any regression setting) and give it a prior centered on negative values, to encode your knowledge that the probability of success in those cases is close to 0. If the model then infers that the intercept is indeed (very) negative, that’s a good sign, given your domain knowledge.
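To make the sigmoid arithmetic concrete (the -4 below is just an illustrative value, not a recommendation for your prior mean):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# With all predictors at 0, the intercept alone sets the baseline probability.
print(sigmoid(0))    # 0.5   -> what fixing the intercept to 0 implies
print(sigmoid(-4))   # ~0.018 -> what a clearly negative intercept encodes
```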
Hope this helps