Thanks @ricardoV94! Does that mean deterministic variables are generally discouraged (for models with many groups and/or observations) because they can result in a huge memory footprint?
How would one then go about re-computing e.g. the mean of mu_ when generating predictions on unseen data? By making a dedicated model for predictions, akin to what’s described in Out of model predictions with PyMC - PyMC Labs?
Ideally, I’d like to fix the definition of those deterministic variables when I specify the model for sampling (“fitting to training data”), but without necessarily storing them during posterior sampling. Then, when generating predictions on unseen data, I could simply “activate” them to compute the mean of mu_ without having to define a new model.
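To make this concrete, here is roughly the workflow I have in mind, with toy data and made-up priors (so please read it as a sketch, not my actual model), and assuming a recent PyMC where `pm.sample` accepts `var_names` and `pm.sample_posterior_predictive` can recompute Deterministics:

```python
import numpy as np
import pymc as pm

# toy stand-ins for the real training data
rng = np.random.default_rng(0)
n_groups = 5
x_train = rng.normal(size=100)
g_train = rng.integers(n_groups, size=100)
y_train = rng.normal(size=100)

with pm.Model() as model:
    x = pm.Data("x", x_train)  # swappable inputs for later predictions
    g = pm.Data("g", g_train)

    alpha = pm.Normal("alpha", 0.0, 1.0)
    beta = pm.Normal("beta", 0.0, 1.0)
    omega = pm.HalfNormal("omega", 1.0)
    tau = pm.HalfNormal("tau", 1.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    epsilon_alpha_g = pm.Normal("epsilon_alpha_g", 0.0, 1.0, shape=n_groups)
    epsilon_beta_g = pm.Normal("epsilon_beta_g", 0.0, 1.0, shape=n_groups)

    mu_ = pm.Deterministic(
        "mu_",
        alpha + omega * epsilon_alpha_g[g] + (beta + tau * epsilon_beta_g[g]) * x,
    )
    pm.Normal("y_obs", mu=mu_, sigma=sigma, observed=y_train, shape=x.shape)

    # keep mu_ out of the trace while sampling: var_names lists what gets stored
    idata = pm.sample(
        var_names=["alpha", "beta", "omega", "tau", "sigma",
                   "epsilon_alpha_g", "epsilon_beta_g"]
    )

# later: "activate" mu_ on unseen data without defining a second model
x_new = rng.normal(size=20)
g_new = rng.integers(n_groups, size=20)
with model:
    pm.set_data({"x": x_new, "g": g_new})
    preds = pm.sample_posterior_predictive(idata, var_names=["mu_", "y_obs"])
```

Would that be the intended pattern, or is a dedicated prediction model still the recommended route?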
Having to omit those variables just to “reduce the memory footprint” of my traces also leads to harder-to-read code IMHO; e.g., the likelihood definition becomes
pm.Normal("y_obs", mu=alpha + omega * epsilon_alpha_g[g] + (beta + tau * epsilon_beta_g[g]) * x + sigma=sigma)
While this still looks comprehensible, I am afraid I would quickly lose sight of things once I have a more complicated multi-level hierarchy and/or, e.g., an errors-in-variables model.
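(I realize I can keep named intermediates as plain PyTensor expressions, which costs nothing because only `pm.Deterministic` wrappers end up in the trace; in the sketch above that would read as follows, though then those quantities are of course not retrievable from the trace afterwards:)

```python
# plain PyTensor expressions: readable names, but nothing extra stored
alpha_g = alpha + omega * epsilon_alpha_g      # non-centered group intercepts
beta_g = beta + tau * epsilon_beta_g           # non-centered group slopes
mu_ = alpha_g[g] + beta_g[g] * x
pm.Normal("y_obs", mu=mu_, sigma=sigma, observed=y_train, shape=x.shape)
```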
I also haven’t found an easy way to swap in a backend that stores the trace directly to disk. As far as I understand, the default (NumPy-based) backend holds all variables in memory; is that correct?
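The only thing I currently do is write the finished InferenceData to disk afterwards, which of course does nothing about the peak memory use during sampling:

```python
import arviz as az

idata.to_netcdf("trace.nc")          # persist the finished trace
idata = az.from_netcdf("trace.nc")   # reload in a later session
```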
Thanks a lot for the great help and support.