Thanks so much for your help here and for sharing the notebook. I've implemented your approach: building two nearly identical models that differ primarily in how X is defined, rather than a single model with a Theano shared variable replacing the input data. It seems to do the trick, though I'm not sure I fully understand why it works.
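For reference, here's a minimal sketch of the masked-array input I'm passing as `observed=` in the first model (the data here are made up; as I understand it, PyMC3 treats the masked entries as missing values and automatically creates the `X_missing` variable for them):

```python
import numpy as np

# Hypothetical training data where NaN marks a missing entry.
X_train = np.array([1.0, 2.0, np.nan, 4.0, np.nan])

# Mask the NaNs. Passing a masked array as `observed=` is what
# triggers PyMC3's automatic imputation of the masked entries.
X_masked = np.ma.masked_invalid(X_train)

print(X_masked.mask)          # [False False  True False  True]
print(X_masked.compressed())  # the observed (unmasked) values
```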
Specifically:
- In the initial model, X is an observed variable. Why do you not need to define X as an observed variable in the second model used for prediction?
- What is the purpose of `tt.squeeze`? Why could I not redefine X the same way it is defined in the first model: `X_modeled = pm.Normal('X', mu=Xmu, sd=1, observed=X_new)`? Does this somehow replace the original mask used in the first model during training with the new mask used for prediction?
- Is there any way to see the `X_missing` estimates generated in the second model when running ppc? This would help me confirm that the mask is working correctly.
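For context on my `tt.squeeze` question: my understanding is that it follows NumPy's `squeeze` semantics, i.e. it drops length-1 axes. A quick NumPy illustration of what I assume it is doing to the shape (the array here is made up):

```python
import numpy as np

# A column vector with a singleton trailing axis, shape (5, 1).
x = np.arange(5.0).reshape(5, 1)

# squeeze drops the length-1 axis, leaving shape (5,).
print(x.shape)              # (5, 1)
print(np.squeeze(x).shape)  # (5,)
```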
Again, thanks so much.