Held-out prediction with latent variables whose dimension changes with the test set


I have a hierarchical regression model that predicts several responses for each user. At prediction time, I want to hold out some users and predict all of their responses. The current model includes an independent latent variable per user, so prediction on the test set involves a different number of latent variables than training.

What is the best way to run this prediction in pymc3? I’ve opted to build a second model for prediction on the test set, with as many latent variables as there are users in the test set. For the remaining variables I have used the mean of their posterior samples, but ideally I would like to sample them as well.
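To make the "sample them as well" idea concrete, here is a minimal NumPy-only sketch, not actual pymc3 code. It assumes a toy model in which response `k` of user `u` has mean `intercept[k] + loading[k] * z[u]`, with `z[u] ~ Normal(0, sigma_z)`. The names `intercept`, `loading`, `sigma_z` and `sigma_y` are hypothetical, and the arrays below are random stand-ins for draws that would come from the training trace. For each posterior draw of the shared variables, a fresh latent variable is drawn for every test user from the population distribution, which is reasonable here because none of the held-out users' responses are observed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for posterior draws of the shared (non per-user) variables.
# In practice these would be taken from the training trace instead.
n_draws, n_responses = 500, 3
intercept = rng.normal(0.0, 0.1, size=(n_draws, n_responses))
loading = rng.normal(1.0, 0.1, size=(n_draws, n_responses))
sigma_z = np.abs(rng.normal(1.0, 0.05, size=n_draws))
sigma_y = np.abs(rng.normal(0.5, 0.05, size=n_draws))


def predict_new_users(n_test_users, rng):
    """Draw responses for unseen users: one fresh latent per user and per draw."""
    # Latents for the new users, shape (n_draws, n_test_users),
    # sampled from the population distribution of each posterior draw.
    z = rng.normal(0.0, sigma_z[:, None], size=(n_draws, n_test_users))
    # Mean response, shape (n_draws, n_test_users, n_responses).
    mu = intercept[:, None, :] + z[:, :, None] * loading[:, None, :]
    # Add observation noise using the same posterior draw.
    return rng.normal(mu, sigma_y[:, None, None])


y_pred = predict_new_users(n_test_users=4, rng=rng)
print(y_pred.shape)  # (500, 4, 3)
```

Because the number of test users only enters through the shape of `z`, the same posterior draws can be reused for a test set of any size, and the uncertainty in the shared variables carries through to the predictions.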


Another solution I’ve tried is to build the model on all the data and treat the held-out data as missing. This lets me use the posterior samples of the training variables (not only their means) to predict the held-out data, but what I don’t like is that I cannot separate training from prediction.
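For reference, the "treat as missing" setup can be sketched as follows. The array sizes and the `held_out_users` indices are made up for illustration. The sketch only builds the masked array; as far as I know, passing such an array as `observed` to a pymc3 likelihood makes pymc3 impute the masked entries as extra sampled variables.

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholder response matrix: one row per user, one column per response.
n_users, n_responses = 6, 3
y_all = rng.normal(size=(n_users, n_responses))

# Hypothetical indices of the users to hold out.
held_out_users = np.array([4, 5])

# Mask every response of the held-out users.
mask = np.zeros(y_all.shape, dtype=bool)
mask[held_out_users, :] = True
y_masked = np.ma.masked_array(y_all, mask=mask)

print(y_masked.mask.sum())  # 6 masked entries (2 users x 3 responses)
```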