About GLM: Logistic Regression example in Tutorials

EduardoCabria · April 11, 2021, 11:47am

Hello, I’m studying this case: GLM: Logistic Regression — PyMC3 3.10.0 documentation

I have a question about interpreting this illustration. Can someone help me understand what this confidence interval (1.378 to 1.413) really means with respect to the education trace? And why is it supposed to help clarify the model?

Thanks!

ricardoV94 · April 11, 2021, 1:18pm

Is your question about the meaning of a confidence interval or about the variable in question (i.e., the odds ratio of the educ coefficient)?

A confidence interval just means that, given your model and data, it is plausible that the true value of that variable lies in that range.

EduardoCabria · April 11, 2021, 1:35pm

Thank you for the answer @ricardoV94

And what is the purpose of analyzing the odds ratio of the educ coefficient?

ricardoV94 · April 11, 2021, 1:44pm

The odds ratio is in a sense the “scale” where variables live in a logistic regression.

Whereas in a linear regression you can interpret a coefficient of, say, 1.5 as saying that for every 1-unit increase in your predictor variable (e.g., years of education) there is a 1.5 increase in your predicted variable (e.g., raw income), in a logistic regression things are a bit different.

Here you are predicting binary outcomes (0 or 1, e.g., income above or below 50k), and an exponentiated coefficient of 1.5 says that for every 1-unit increase in your predictor variable the “odds” (i.e., the relative likelihood, p/(1-p)) of your predicted variable being 1 (i.e., income > 50k) instead of 0 are multiplied by 1.5. Read another way, with every extra year of education your odds of making > 50k rather than not are ~1.5x those of someone with one year less of education.
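
To make the multiplicative reading concrete, here is a minimal sketch in plain Python. The intercept and the coefficient are made-up values chosen for illustration (they are not taken from the tutorial); the point is only that a one-unit step in the predictor multiplies the odds by the exponentiated coefficient, while the probability itself changes by a different amount.

```python
import math

def sigmoid(z):
    """Inverse logit: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds p / (1 - p) for a probability p."""
    return p / (1.0 - p)

# Hypothetical values, chosen only for illustration.
intercept = -3.0
beta_educ = math.log(1.5)  # coefficient on the log-odds scale, so exp(beta) = 1.5

# Predicted probability of income > 50k at 12 and 13 years of education.
p_12 = sigmoid(intercept + beta_educ * 12)
p_13 = sigmoid(intercept + beta_educ * 13)

# The ratio of the odds equals exp(beta_educ), whatever the starting point.
odds_ratio = odds(p_13) / odds(p_12)

# The ratio of the probabilities is NOT 1.5; it depends on where you start.
probability_ratio = p_13 / p_12

print(f"odds ratio: {odds_ratio:.3f}")
print(f"probability ratio: {probability_ratio:.3f}")
```

The odds ratio stays the same for any pair of adjacent education values, which is why it is a convenient summary of a logistic regression coefficient.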

EduardoCabria · April 11, 2021, 3:28pm

Thank you very much for your response @ricardoV94. You solved this for me!

I have one last question, though: to get these odds ratios, should you always take the exponential of the traces obtained from model inference?

ricardoV94 · April 11, 2021, 3:42pm

Yes. You exponentiate (the traces of) your coefficients to obtain the respective odds ratio they imply. This is because the logistic regression works with the logarithm of the odds and not with the odds directly.
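
As a rough sketch of that step, assuming the posterior draws of a coefficient are available as a NumPy array: the samples below are synthetic stand-ins generated with a random number generator, not real output from the tutorial model, and the name `educ_samples` is hypothetical.

```python
import numpy as np

# Synthetic stand-in for posterior draws of the educ coefficient
# (log-odds scale). With a real model these would come from the trace.
rng = np.random.default_rng(0)
educ_samples = rng.normal(loc=0.33, scale=0.007, size=4000)

# Exponentiate every draw to move from log-odds to odds ratios.
odds_ratio_samples = np.exp(educ_samples)

# Summarize with a central 95% interval on the odds-ratio scale.
lb, ub = np.percentile(odds_ratio_samples, [2.5, 97.5])
print(f"P({lb:.3f} < odds ratio < {ub:.3f}) = 0.95")
```

Because the exponential is monotonic, you get essentially the same bounds whether you exponentiate the draws first or exponentiate the percentiles of the raw draws.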

Note that it’s often useful to check what the model means in terms of probabilities, by comparing the predictions when x=a vs x=b. This is usually more intuitive.
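
One way such a comparison could look, again with synthetic draws standing in for a real trace (the values, the names, and the choice of 12 vs. 16 years of education are all assumptions made for illustration):

```python
import numpy as np

# Synthetic stand-ins for posterior draws. Real draws come jointly from the
# sampler and may be correlated; these are independent for simplicity.
rng = np.random.default_rng(1)
intercept_samples = rng.normal(loc=-5.0, scale=0.1, size=2000)
educ_samples = rng.normal(loc=0.33, scale=0.01, size=2000)

def predicted_probability(years_of_education):
    """Probability of the outcome being 1, computed for every posterior draw."""
    log_odds = intercept_samples + educ_samples * years_of_education
    return 1.0 / (1.0 + np.exp(-log_odds))

# Compare predictions at x = a (12 years) and x = b (16 years).
p_a = predicted_probability(12)
p_b = predicted_probability(16)
difference = p_b - p_a

lb, ub = np.percentile(difference, [2.5, 97.5])
print(f"mean P(y=1 | x=12): {p_a.mean():.3f}")
print(f"mean P(y=1 | x=16): {p_b.mean():.3f}")
print(f"95% interval for the difference: ({lb:.3f}, {ub:.3f})")
```

Working per draw keeps the uncertainty: the result is a distribution over the difference in probabilities rather than a single number.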

EduardoCabria · April 11, 2021, 3:50pm

Do you mean, for example, what is done in the same example above? (attached)

Thank you so much Ricardo, you are helping me a lot.

ricardoV94 · April 11, 2021, 4:10pm

Exactly
