I have multi level data that I want to model to determine if there are differences between groups (data from separate years in my case). I want to know if the predictors are different and/or if the observed outcome (response) is different between years. Under normal circumstances, I could build a normal multi-level linear model. However, the predictors appear very nonlinearly related to the response when plotted individually (e.g., plt.scatter(data.feature1, date.observed)). I was trying to find a relatively straight forward way to do this while trying to preserve some interpretability in the model.

I thought of two possible approaches for this: BART or Generalized Additive Models (GAMs). Is there a way to add a hierarchical structure in a BART model to compare the predictors and posterior for the response across years? i.e., something like a partial pooled model where each year’s data is a level. The main reason I was thinking BART is because of the nonlinearity in the feature mapping and the partial dependence plots for some interpretability. If this is not possible, I was thinking of implementing a GAM, however, I am not certain how to do this as a multilevel model? I know how to implement a single GAM (statsmodels and pyGAM) but not a multilevel version. I know the GAM may be slightly outside of this forum, but I figured it wouldn’t hurt to ask.

Thanks!