About GLM: Logistic Regression example in Tutorials

Hello, I'm studying this example: GLM: Logistic Regression — PyMC3 3.10.0 documentation

I have a question about interpreting this illustration. Can someone help me understand what this confidence interval (1.378 to 1.413) really means with respect to the education trace? And why is it supposed to help us clarify the model?

Thanks!

Is your question about the meaning of a confidence interval, or about the variable in question (i.e., the odds ratio of the educ coefficient)?

A confidence interval (strictly speaking a credible interval here, since the model is Bayesian) just means that, given your model and data, it is plausible that the true value of that variable lies in that range.

Thank you for the answer @ricardoV94

And what is the purpose of analyzing the odds ratio of the educ coefficient?

The odds ratio is, in a sense, the “scale” on which coefficients are interpreted in a logistic regression.

Whereas in a linear regression you can interpret a coefficient of, say, 1.5 as saying that for every 1-unit increase in your predictor variable (e.g., years of education) there is a 1.5-unit increase in your predicted variable (e.g., raw income), in a logistic regression things are a bit different.

Here you are predicting binary outcomes (0 or 1, e.g., income above or below 50k), and an exponentiated coefficient of 1.5 says that for every 1-unit increase in your predictor variable, the “odds” (i.e., the relative likelihood p/(1-p)) of your predicted variable being 1 (i.e., income > 50k) instead of 0 are multiplied by 1.5. Read another way, it says that for every extra year of education, your odds of making > 50k rather than not are ~1.5x those of someone with one year less of education. Note that this is a statement about the odds, not about the probability itself.
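To make the difference between odds and probability concrete, here is a small sketch. The starting probability of 0.20 and the odds ratio of 1.5 are made-up numbers for illustration, not values from the tutorial.

```python
def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


def prob_from_odds(o):
    """Convert odds back into a probability, o / (1 + o)."""
    return o / (1.0 + o)


odds_ratio = 1.5   # hypothetical exponentiated coefficient
p_start = 0.20     # hypothetical baseline probability of income > 50k

# Apply one extra "year of education" three times in a row.
probs = [p_start]
for _ in range(3):
    new_odds = odds(probs[-1]) * odds_ratio  # the odds are multiplied by 1.5
    probs.append(prob_from_odds(new_odds))

for years, p in enumerate(probs):
    print(f"+{years} years: odds = {odds(p):.4f}, probability = {p:.4f}")
```

Each step multiplies the odds by exactly 1.5, while the probability grows by a smaller and changing factor, and always stays below 1.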

Thank you very much for your response @ricardoV94. You solved this for me!

I have one last question, though: to get these odds ratios, should I always take the exponential of the traces obtained after model inference?

Yes. You exponentiate (the traces of) your coefficients to obtain the respective odds ratio they imply. This is because the logistic regression works with the logarithm of the odds and not with the odds directly.
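As a rough sketch of that step, the snippet below exponentiates posterior samples and summarizes them with a 95% interval. The samples are synthetic stand-ins drawn with NumPy (the mean and spread are assumptions chosen for illustration); in a real analysis you would use the samples from your fitted trace instead, for example something like `trace["educ"]` in PyMC3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for posterior samples of the educ coefficient
# (log-odds scale). Replace with the samples from your own trace.
educ_samples = rng.normal(loc=0.33, scale=0.007, size=4000)

# Exponentiate every sample to move from log-odds to odds ratios.
odds_ratio_samples = np.exp(educ_samples)

# Summarize the odds ratio with a 95% interval from the percentiles.
lower, upper = np.percentile(odds_ratio_samples, [2.5, 97.5])
print(f"95% interval for the odds ratio: {lower:.3f} to {upper:.3f}")

# Because exp() is monotonic, exponentiating the interval endpoints of the
# raw coefficient gives essentially the same result.
lb, ub = np.percentile(educ_samples, [2.5, 97.5])
print(f"exp of coefficient interval:     {np.exp(lb):.3f} to {np.exp(ub):.3f}")
```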

Note that it’s often useful to check what the model means in terms of probabilities, by comparing the predictions when x=a vs x=b. This is usually more intuitive.
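One way to sketch such a comparison: push the posterior samples through the logistic function at two predictor values and look at the difference in predicted probability. Everything here is hypothetical (the synthetic intercept and slope samples, and the values 12 and 16 years of education), and any other predictors are assumed to be held fixed and absorbed into the intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 4000

# Hypothetical posterior samples; replace with the ones from your trace.
intercept = rng.normal(loc=-5.0, scale=0.1, size=n_samples)
beta_educ = rng.normal(loc=0.33, scale=0.007, size=n_samples)


def predicted_prob(x, intercept, slope):
    """Logistic (inverse logit) of the linear predictor intercept + slope * x."""
    return 1.0 / (1.0 + np.exp(-(intercept + slope * x)))


educ_a, educ_b = 12, 16  # x = a vs x = b, in years of education

# One predicted probability per posterior sample, at each predictor value.
p_a = predicted_prob(educ_a, intercept, beta_educ)
p_b = predicted_prob(educ_b, intercept, beta_educ)
diff = p_b - p_a

lo, hi = np.percentile(diff, [2.5, 97.5])
print(f"P(income > 50k | educ={educ_a}): mean {p_a.mean():.3f}")
print(f"P(income > 50k | educ={educ_b}): mean {p_b.mean():.3f}")
print(f"Difference: mean {diff.mean():.3f}, 95% interval {lo:.3f} to {hi:.3f}")
```

Working per sample keeps the uncertainty in the coefficients, so the difference in probability comes with its own interval rather than a single number.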

Do you mean, for example, what is done in the same example above? (attached)

Thank you so much Ricardo, you are helping me a lot.

Exactly