I have a question about the number of samples generated by sample_posterior_predictive.
Suppose I have fitted my model and obtained a posterior trace with 4 chains of 1000 samples each (= 4000 samples from the posterior in total). Now, I would like to make posterior predictions and compare them to my observations. Specifically, I’d like to create a plot like the one used in the Principled Bayesian Workflow by Betancourt: https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html#step_fourteen:_posterior_retrodictive_checks
To do that, I need to create multiple samples for my observed variable for each of the posterior samples. For instance, I’d like to create 100 samples for each of the 4000 samples from the posterior.
Now I’m a bit puzzled by the samples parameter of the sample_posterior_predictive function:
- The documentation says: “It is not recommended to modify this value; when modified, some chains may not be represented in the posterior predictive sample.” I don’t understand that statement. To me, it seems straightforward to say that I’d like to draw, e.g., 100 samples for each posterior sample. Why do we risk not using some chains?
- When I set, e.g., samples=8000, I get a posterior prediction of shape (8000,). However, I would like to get a shape of (4000, 2), indicating that in this case I took 2 samples for each posterior sample. Ideally, I would expect the function to accept something like samples_per_posterior_sample=2.
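To make the shape I am after concrete, here is a minimal NumPy sketch. It is not PyMC code: it assumes a normal likelihood with hypothetical parameters mu and sigma, uses random numbers as a stand-in for the flattened trace, and treats samples_per_posterior_sample as an illustrative name rather than an existing argument.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a posterior trace: 4 chains x 1000 draws, flattened to 4000.
# mu and sigma are hypothetical parameter names, not taken from a real model.
n_chains, n_draws = 4, 1000
n_posterior = n_chains * n_draws
mu = rng.normal(0.0, 1.0, size=n_posterior)
sigma = np.abs(rng.normal(1.0, 0.1, size=n_posterior))

# Illustrative counterpart of the argument I was hoping for.
samples_per_posterior_sample = 2

# Add a trailing axis so every posterior draw gets its own row of predictions.
y_pred = rng.normal(
    loc=mu[:, None],
    scale=sigma[:, None],
    size=(n_posterior, samples_per_posterior_sample),
)

print(y_pred.shape)  # (4000, 2)
```

Each row of y_pred belongs to exactly one posterior draw, so every chain is represented by construction.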
Am I misunderstanding something here?