Best logistic model structure for boolean covariates and interactions

john_c · February 22, 2021, 9:32pm

I have a simple model where I’m trying to predict whether someone will answer yes/no to a question based on whether they are employed (0/1), married (0/1), have kids (0/1), their age (continuous), and 5 other domain-specific boolean covariates. What would be the best way to specify an interaction model for this?

I understand how to specify an interaction between age and the other features, but I don’t know an efficient way to specify interactions among the binary features themselves (e.g. being married AND having kids AND being employed may have an interaction effect). There are 1024 combinations of the boolean features, which makes things particularly complicated. Any ideas?

jonsedar · March 8, 2021, 7:06am

Having so many interaction terms seems a little impractical from an inference standpoint too: do you have reason to think that a + b + c + a:b + b:c + a:c + a:b:c, etc. would capture more information than the simple additive model a + b + c?
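To get a feel for how quickly the terms pile up, here is a rough sketch (not a recommendation) that builds the interaction columns for binary covariates up to a chosen order. The helper name `interaction_design`, the covariate names, and the toy data are all made up for illustration.

```python
from itertools import combinations

import numpy as np


def interaction_design(X, names, max_order=2):
    """Return (design matrix, labels) holding the main effects plus every
    product of distinct columns up to `max_order` columns at a time."""
    X = np.asarray(X, dtype=float)
    cols, labels = [], []
    for order in range(1, max_order + 1):
        for idx in combinations(range(X.shape[1]), order):
            # For 0/1 covariates the product is 1 only when all are 1.
            cols.append(X[:, list(idx)].prod(axis=1))
            labels.append(":".join(names[i] for i in idx))
    return np.column_stack(cols), labels


# Hypothetical toy data: 3 binary covariates, 6 respondents.
names = ["employed", "married", "kids"]
X = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
])

D2, labels2 = interaction_design(X, names, max_order=2)  # pairwise only
D3, labels3 = interaction_design(X, names, max_order=3)  # full expansion
print(labels3)
```

With k boolean covariates, stopping at pairwise terms gives k + k(k-1)/2 columns, while the full expansion gives 2^k - 1, so capping the order is one way to keep the model manageable if you do want interactions.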

If you still want to enforce / measure the correlation between coefficients, I suppose you could try drawing them from a correlated MvNormal, similar to what McElreath does here: http://xcelab.net/rmpubs/Mcelreath%20Koster%202014.pdf
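As a very rough illustration of what "correlated coefficients" means, the NumPy snippet below draws coefficient vectors from a multivariate normal with a fixed covariance. The number of coefficients, the scales, and the common correlation of 0.5 are all assumptions picked for the example; in an actual model you would put a prior on the correlation (e.g. an LKJ prior) rather than fix it.

```python
import numpy as np

rng = np.random.default_rng(0)

n_coef = 4                      # assumed number of coefficients
sd = np.full(n_coef, 1.0)       # assumed prior scale for each coefficient
rho = 0.5                       # assumed common correlation (fixed here)

# Covariance = diag(sd) @ corr @ diag(sd)
corr = np.full((n_coef, n_coef), rho)
np.fill_diagonal(corr, 1.0)
cov = np.outer(sd, sd) * corr

# Each row is one joint draw of the coefficient vector.
beta = rng.multivariate_normal(mean=np.zeros(n_coef), cov=cov, size=5000)

# Empirical correlation of the draws should sit close to rho.
emp_corr = np.corrcoef(beta, rowvar=False)
print(emp_corr.round(2))
```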
