It depends on how the variables are coded. For example, if superClusterID==1 can occur within two different countries, i.e. the codes are not unique, then you have to specify the nesting explicitly. If the codes are unique across all variables, there's no need to worry about it. You'll find valuable information here: mixed model - Crossed vs nested random effects: how do they differ and how are they specified correctly in lme4? - Cross Validated (stackexchange.com).
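For instance, here is a minimal sketch of the "unique codes" route, with made-up column values: building a combined identifier in pandas so the same cluster code can never refer to two different countries, which means you don't have to express the nesting in the formula at all.

```python
import pandas as pd

# Hypothetical data where superClusterID values repeat across countries,
# so the cluster codes alone do not identify a cluster uniquely.
df = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "superClusterID": [1, 2, 1, 2],
})

# Build a new identifier that is unique across countries.
df["superClusterID_unique"] = (
    df["country"].astype(str) + "_" + df["superClusterID"].astype(str)
)

print(df["superClusterID_unique"].unique())  # ['A_1' 'A_2' 'B_1' 'B_2']
# With unique codes, a term like (1|superClusterID_unique) behaves the same
# whether you think of the effect as nested or crossed.
```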
On the other hand, Bambi relies on formulae to generate design matrices, and for group-specific effects it produces a regular (dense) matrix. In this case that is a very large matrix of mostly zeros, where a sparse matrix would have been better, so I think this explains the large memory consumption. It is something we still need to improve on our end. However, this matrix is not used directly in the PyMC model; we use slicing to select only the non-zero values.
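To give a sense of the difference, here is a rough sketch (with made-up dimensions, not the actual Bambi internals) comparing the memory footprint of a dense group-specific design matrix against a scipy sparse version of the same matrix.

```python
import numpy as np
from scipy import sparse

# Hypothetical group-specific design matrix: n observations, k groups,
# each row has a single 1 in the column of its group, zeros elsewhere.
n, k = 10_000, 500
rng = np.random.default_rng(0)
group_idx = rng.integers(0, k, size=n)

Z_dense = np.zeros((n, k))
Z_dense[np.arange(n), group_idx] = 1.0

Z_sparse = sparse.csr_matrix(Z_dense)

print(f"dense:  {Z_dense.nbytes / 1e6:.2f} MB")
sparse_bytes = (
    Z_sparse.data.nbytes + Z_sparse.indices.nbytes + Z_sparse.indptr.nbytes
)
print(f"sparse: {sparse_bytes / 1e6:.2f} MB")
```

In practice you can even skip the matrix product entirely and just index the group effects with `group_idx`, which is the slicing idea mentioned above.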
What could be good in this case is starting from the Bambi model and then trying to replicate it in PyMC, where you have more control over all the fine details. See the sketch below for what that mapping could look like.
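As a rough illustration, here is a hedged sketch of a PyMC model corresponding to something like `y ~ 1 + (1|country) + (1|cluster)`. The column names, priors and simulated data are placeholders, not the actual model; the point is that the group-specific effects are indexed directly with integer codes rather than multiplied by a large, mostly-zero design matrix.

```python
import numpy as np
import pandas as pd
import pymc as pm

rng = np.random.default_rng(42)

# Placeholder data; replace with your own DataFrame and column names.
n = 200
df = pd.DataFrame({
    "country": rng.choice(["A", "B", "C"], size=n),
    "superClusterID": rng.integers(1, 5, size=n),
})
df["cluster"] = df["country"] + "_" + df["superClusterID"].astype(str)
df["y"] = rng.normal(size=n)

# Integer codes used to slice the group-specific effects.
country_idx, countries = pd.factorize(df["country"])
cluster_idx, clusters = pd.factorize(df["cluster"])

coords = {"country": countries, "cluster": clusters}

with pm.Model(coords=coords) as model:
    # Common (population-level) intercept
    intercept = pm.Normal("Intercept", 0, 10)

    # Group-specific intercepts
    sigma_country = pm.HalfNormal("sigma_country", 1)
    u_country = pm.Normal("u_country", 0, sigma_country, dims="country")

    sigma_cluster = pm.HalfNormal("sigma_cluster", 1)
    u_cluster = pm.Normal("u_cluster", 0, sigma_cluster, dims="cluster")

    # Index the effects instead of building a dense design matrix.
    mu = intercept + u_country[country_idx] + u_cluster[cluster_idx]

    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y", mu=mu, sigma=sigma, observed=df["y"].to_numpy())

    idata = pm.sample()
```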
These are unfinished ideas, but they may help you understand how to map a Bambi model to a PyMC one.
- unfinished-ideas/sparse.ipynb at main · bambinos/unfinished-ideas (github.com)
- unfinished-ideas/sparse_2.ipynb at main · bambinos/unfinished-ideas (github.com)
Edit: Note that the examples rely on an older version of Bambi that used PyMC3. Almost everything would be identical now, but be aware you need to use Aesara instead of Theano and PyMC instead of PyMC3.