then are ~3000 parameter models feasible to estimate in PyMC3?
Sure. I once fitted a model to with data from ~9000 individual participants, each had several parameters (at least 3, I think). You just have to put them all in the same vector and use matrix algebra to map them to the correct values in your dependent variables.
just it would be slightly simpler if I could do that with only one single-participant set of models.
Perhaps I’m wrong about this, but if the model is non-hierarchical and one participant’s parameters are independent from another participant’s parameters then fitting one big model is equivalent to fitting many smaller models. I think that WAIC/LOO values are given elementwise (e.g., per trial in an experiment). I had always assumed that from the big model you can get the WAIC/LOO for the smaller models simply by summing the elementwise values over the appropriate individual elements, but perhaps this is not true?
Obviously, this is not true if the big model is hierarchical, because due to partial pooling you will end up with different parameter estimates. However I think it is still theoretically valid to carve up the big model; you simply used to prior information to get better parameter estimates in the individual models.