Build model with varying combination of RVs

At first blush this looks like premature optimization. BLAS routines are some of the most heavily optimized, highest-performance code out there, so to beat X @ beta you really need to be in a special situation. Do you have a working model that does it the “wasteful” way, with unacceptably slow performance, that you can share?

If the feature matrix is very sparse, you could try sparse matrix operations as an alternative. As a rough rule of thumb, I think you need something like 5% or fewer non-zero entries before it’s “worth it”.
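To check where your matrix falls relative to that rough threshold, here is a minimal NumPy-only sketch. It measures the fraction of non-zero entries and compares the dense product against a hand-rolled coordinate-format (COO) product. The helper names (`density`, `to_coo`, `coo_matvec`) and the synthetic data are made up for illustration; they are not part of any library, and a real model would use a proper sparse backend rather than this toy.

```python
import numpy as np

def density(X):
    """Fraction of entries in X that are non-zero."""
    return np.count_nonzero(X) / X.size

def to_coo(X):
    """Return (row indices, column indices, values) of the non-zero entries."""
    rows, cols = np.nonzero(X)
    return rows, cols, X[rows, cols]

def coo_matvec(rows, cols, vals, beta, n_rows):
    """Compute X @ beta touching only the stored non-zero entries."""
    # Each non-zero X[i, j] contributes X[i, j] * beta[j] to output row i.
    return np.bincount(rows, weights=vals * beta[cols], minlength=n_rows)

# Synthetic feature matrix with roughly 3% non-zero entries (assumed data).
rng = np.random.default_rng(0)
n, p = 2000, 50
X = rng.normal(size=(n, p)) * (rng.random((n, p)) < 0.03)
beta = rng.normal(size=p)

rows, cols, vals = to_coo(X)
dense_result = X @ beta
sparse_result = coo_matvec(rows, cols, vals, beta, n)

print(f"density: {density(X):.3f}")
print("results match:", np.allclose(dense_result, sparse_result))
```

Timing both versions on your actual X (for example with `timeit`) is the only reliable way to tell whether the sparse route pays off; the dense BLAS call may well still win at this size.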