Hi pymc community,
I have a latent variable mixture model, where a number of raters’ judgments on a particular trial are each assigned as either “real” or “contaminant”. The number of possible “real” judgments (z) is a latent categorical variable inferred during sampling.
After I have fit the model and generated a trace, I would like to sample from the posterior predictive, BUT only after fixing z to its posterior mode. In other words, I would like to convert z into an observed variable.
What’s the best way to do this? A few ways that might work are:
- Slice the trace down to just the samples where z = my_desired_values, and pass it into sample_ppc as normal.
- Build another, almost identical model, with z as observed this time, fit it, then run sample_ppc on it.
- Use sample_ppc as normal, without z observed, and then remove the posterior predictive samples where z != my_desired_values. This seems inefficient, as a lot of the PP samples would be thrown away.
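To make the first option concrete, here is a rough sketch of the filtering step on its own, in plain Python. The `draws` list is a made-up stand-in for a fitted trace (one dict per posterior draw), and `posterior_mode` and `condition_on` are hypothetical helper names, not PyMC functions. It assumes z is a scalar per draw; a vector-valued z would need converting to a tuple before counting. Whether sample_ppc accepts the filtered draws directly is something I have not verified.

```python
from collections import Counter
import random

random.seed(0)

# Hypothetical stand-in for a fitted trace: one dict per posterior draw.
# In a real model these values would come from the sampler.
draws = [
    {"z": random.choice([2, 3, 3, 3, 4]), "mu": random.gauss(0.0, 1.0)}
    for _ in range(1000)
]


def posterior_mode(draws, name):
    """Return the most frequently sampled value of a discrete variable."""
    counts = Counter(d[name] for d in draws)
    return counts.most_common(1)[0][0]


def condition_on(draws, name, value):
    """Keep only the draws in which the discrete variable equals `value`."""
    return [d for d in draws if d[name] == value]


z_mode = posterior_mode(draws, "z")
kept = condition_on(draws, "z", z_mode)

# Fraction of the original draws that survive the conditioning step.
kept_fraction = len(kept) / len(draws)
```

The third option applies the same filter, just after the predictive draws have been generated rather than before, so `kept_fraction` also gives a feel for how many PP samples it would discard.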
Edit: now that I think about the options above, the first and third are only sensible if z is discrete (which it is), since they rely on finding samples where z exactly equals the desired value. It would be nice to know what the general solution is when a variable that was not observed becomes observed. Maybe I would have to specify all “possibly observed” variables in the original model, and just feed in missing (NaN) values during the initial sampling.