Are my posterior predictive samples biased if I observe Y?

@OriolAbril Got it, thank you for clearing up the point on the loo function. My reasoning was that I could use the ELPD to get the predicted probability, and that this would be analogous to K-fold. The predictions will be used to split the data by predicted class, 0 or 1 (probability >= 0.5); observations assigned to class 1 will go to a second ML model in production.

Perhaps instead of using the term “unbiased” I should ask this question in terms of target variable leakage. I am dealing with a situation where I can’t yet observe Y for some data. What I want to know is whether the samples from sample_posterior_predictive would represent target leakage in the second ML model. If I were to take the outputs from probabilistic models (say, the mean probability and its standard deviation) and use them as features in the production ML model, would the training-set predictions contain leaked information about Y, because Y was observed when producing those probabilities?
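
To make the leakage concern concrete, here is a minimal sketch in plain Python. It is not PyMC: it assumes a conjugate Beta-Bernoulli model per group as a stand-in for the real probabilistic model, and every name in it (`fit`, `predict`, `out_of_fold_features`, `mean_gap`, the group structure, the fold assignment) is hypothetical. The labels are pure noise, so an honest feature should not separate y = 1 rows from y = 0 rows. The sketch compares a feature computed from a posterior fit on all rows (including each row's own Y) with an out-of-fold feature, where each row is scored by a posterior that never saw its Y.

```python
import random
from collections import defaultdict


def beta_posterior_mean(successes, trials, a=1.0, b=1.0):
    """Posterior mean of a Bernoulli rate under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


def fit(rows):
    """Count successes and trials per group; rows are (group, y) pairs."""
    counts = defaultdict(lambda: [0, 0])
    for group, y in rows:
        counts[group][0] += y
        counts[group][1] += 1
    return counts


def predict(counts, group):
    """Posterior mean for a group; unseen groups fall back to the prior."""
    successes, trials = counts.get(group, (0, 0))
    return beta_posterior_mean(successes, trials)


def in_sample_features(rows):
    """Score every row with a posterior that was fit on all rows."""
    counts = fit(rows)
    return [predict(counts, group) for group, _ in rows]


def out_of_fold_features(rows, k=5):
    """Score every row with a posterior fit on the other k - 1 folds only."""
    features = [None] * len(rows)
    for fold in range(k):
        train = [row for i, row in enumerate(rows) if i % k != fold]
        counts = fit(train)
        for i, (group, _) in enumerate(rows):
            if i % k == fold:
                features[i] = predict(counts, group)
    return features


def mean_gap(features, ys):
    """Mean feature among y = 1 rows minus mean feature among y = 0 rows."""
    ones = [f for f, y in zip(features, ys) if y == 1]
    zeros = [f for f, y in zip(features, ys) if y == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


# Assumed toy data: 100 groups of 4 rows each, labels are coin flips.
rng = random.Random(0)
rows = [(i // 4, rng.randint(0, 1)) for i in range(400)]
ys = [y for _, y in rows]

in_sample = in_sample_features(rows)
oof = out_of_fold_features(rows, k=5)

in_gap = mean_gap(in_sample, ys)
oof_gap = mean_gap(oof, ys)
print(f"in-sample gap:   {in_gap:.3f}")
print(f"out-of-fold gap: {oof_gap:.3f}")
```

In this toy setup the in-sample gap is always larger than the out-of-fold gap, because each row's own Y is counted in the posterior that scores it. If the same logic carries over to the real model, features for the training rows of the second model would be safer when generated out of fold, while rows whose Y has not been observed yet can be scored by the model fit on all observed data.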

@Lime, I will look into Jackknife-debiased estimates, but hopefully my clarification helps.