While toying around with BART and possible ideas to try, I came across this article:
with the following statement:
Decision trees based algorithms (Random Forest, Gradient Boosted Trees, XGBoost) build their split rules according to one feature at a time. This means that they will fail to process these two features simultaneously whereas the cos/sin values are expected to be considered as one single coordinates system.
One of the answers in the comments was:
I ended up using CatBoost as their algorithms handle categorical variables natively.
So naturally I was wondering how this applies to BART.
Can we use sin and cos features, as is possible in CatBoost, or would that be a bad idea? And if we can, what would be a possible way forward? I think it would be quite fun to try it out.
I am aware that a Fourier transform is used in Structural AR. This question is purely out of interest.
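For reference, here is a minimal sketch of the sin/cos encoding the quoted statement refers to. It uses only the standard library; the helper names (`encode_cyclic`, `circle_distance`) and the hour-of-day example are my own assumptions for illustration, not something taken from the article or from any library.

```python
import math


def encode_cyclic(value, period):
    """Map a cyclic value (e.g. hour of day) to a point on the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def circle_distance(a, b, period):
    """Euclidean distance between two cyclic values after sin/cos encoding."""
    sin_a, cos_a = encode_cyclic(a, period)
    sin_b, cos_b = encode_cyclic(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


# Hours 23 and 1 are 22 apart as raw numbers but neighbours on the clock.
raw_gap = abs(23 - 1)
encoded_gap_near = circle_distance(23, 1, period=24)

# Hours 0 and 12 sit on opposite sides of the circle.
encoded_gap_far = circle_distance(0, 12, period=24)

print(raw_gap, encoded_gap_near, encoded_gap_far)
```

The point of the encoding is that neither column is meaningful alone: the position on the clock is only recovered when sin and cos are read together, which is why the question of joint handling comes up.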
This is an interesting question, and this gives me an idea for creating tests for some features we are developing for BART (unrelated to circular variables). So double thank you!
On to your question. Currently, it’s not possible to tell PyMC-BART that two or more features must be part of the same tree. But in principle, there is no reason BART could not learn that automatically. So this is probably more a question of how to incorporate useful prior knowledge than a problem of BART (or other tree methods) failing. It should not be difficult to adapt the code to add this feature. Do you know of other scenarios where one would want to “force an interaction”? I am also wondering if we should generalize this to prior probabilities other than 1, something like “the prior probability of these two variables interacting is 0.7”…
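To make the "in principle" argument concrete, here is a small standard-library sketch. It is not PyMC-BART code: the target (hours 1 to 5), the thresholds, and the function names are hypothetical choices for illustration. It compares every possible single split on sin or cos against a hand-written tree that splits first on sin and then on cos.

```python
import math

PERIOD = 24
hours = list(range(PERIOD))
# Each row is (sin, cos) of the hour's angle on the clock.
features = [
    (math.sin(2 * math.pi * h / PERIOD), math.cos(2 * math.pi * h / PERIOD))
    for h in hours
]
# Hypothetical target: an arc of the clock that is not centred on either axis.
labels = [1 <= h <= 5 for h in hours]


def stump_errors(column, threshold):
    """Misclassifications of a one-split rule, taking the better orientation."""
    preds = [row[column] > threshold for row in features]
    wrong = sum(p != y for p, y in zip(preds, labels))
    return min(wrong, len(labels) - wrong)


def best_stump_errors():
    """Lowest error achievable with a single split on a single feature."""
    best = len(labels)
    for column in (0, 1):
        values = sorted({round(row[column], 12) for row in features})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            best = min(best, stump_errors(column, (low + high) / 2))
    return best


def two_split_tree(sin_value, cos_value):
    """Hand-written tree: split on sin first, then on cos inside that branch."""
    if sin_value > 0.1:
        return cos_value > 0.1
    return False


tree_errors = sum(
    two_split_tree(s, c) != y for (s, c), y in zip(features, labels)
)
print(best_stump_errors(), tree_errors)
```

No single split separates the arc, because every sin value inside it is shared with an hour outside it (and likewise for cos), while two nested splits recover it exactly. So a tree can represent the joint structure; the open question is whether the prior makes such a tree likely to be found.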
Thank you so much! It makes me really proud if the question is of use for BART. The only similar use case with a forced interaction I can imagine would be seasonal_decompose from statsmodels, or related methods like spectral analysis, but there one would additionally need to handle the trend component with first differencing to get stationarity. Yes, I guess one would use more than one prior, but choosing the best value is another question.
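As a rough illustration of the first-differencing step mentioned above, here is a standard-library sketch on a made-up series (a linear trend plus a sinusoidal season). The slope, period, and length are arbitrary assumptions, and this is not how statsmodels handles the trend internally.

```python
import math


def first_difference(series):
    """Return x[t+1] - x[t] for each consecutive pair in the series."""
    return [b - a for a, b in zip(series, series[1:])]


def mean(values):
    return sum(values) / len(values)


PERIOD = 12   # assumed season length
SLOPE = 0.5   # assumed linear trend per step
N = 49        # 49 points give 48 differences, i.e. four full periods

series = [SLOPE * t + math.sin(2 * math.pi * t / PERIOD) for t in range(N)]
diffed = first_difference(series)

# The raw series drifts upward: the two halves have very different means.
raw_shift = mean(series[24:48]) - mean(series[:24])

# After differencing, both halves average to the slope; the trend is gone
# and only a constant plus the (shifted) seasonal pattern remains.
diff_first_half = mean(diffed[:24])
diff_second_half = mean(diffed[24:])

print(raw_shift, diff_first_half, diff_second_half)
```

Differencing removes the drift but leaves the seasonal component in place, so the periodic part would still need its own treatment, for example through the sin/cos features discussed in this thread.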