Hello,
Is there a difference, or any benefit, between using an entire trace from an MCMC run and first taking samples from the trace (using pm.sample_posterior_predictive) to make inferences?
I'm not sure I quite understand your question given the way it's phrased, so please take this as a first attempt to clarify:
- The number of samples in the trace (`draws`) is your choice.
- The default initialisation for `pm.sample` is `init=jitter+adapt_diag`, see here: Inference (PyMC3 3.11.2 documentation). The sampler has to explore from this initialisation towards the region of higher density, and these "tuning" aka "burn-in" samples are usually discarded and not used for anything downstream.
- To make inferences from the posterior, e.g. with `arviz.plot_posterior` (see arviz.plot_posterior, ArviZ dev documentation), you need samples from the well-mixed, stable region of the sampling, and you simply need a trace with enough samples to yield the precision you want to quote. E.g. a trace of 1000 samples lets you locate a point on that distribution to a resolution of 1/1000, aka 0.001 aka 0.1%. You can set this using `pm.sample(draws=1000, ...)`.
- `pm.sample_posterior_predictive` is quite different: it uses the trace to probabilistically generate new synthetic data. For this you can use a trace of any length, but in practice it makes very good sense to use the same trace that you use for inference (per the previous point).
- `pm.sample_posterior_predictive` does give you the option to select a different number of samples from the trace, i.e. to undersample or even oversample from the full trace. However, per the documentation (quoted below), the PyMC3 devs' recommendation is to set this number differently only if you have a specific reason; otherwise you should accept the default:

**samples** int
Number of posterior predictive samples to generate. Defaults to one posterior predictive sample per posterior sample, that is, the number of draws times the number of chains. It is not recommended to modify this value; when modified, some chains may not be represented in the posterior predictive sample.
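To make the quoted default concrete, here is a minimal standard-library sketch of the idea. It is not PyMC3's implementation: the "trace" is simulated, and the names `trace`, `posterior_predictive` and `sigma` are hypothetical. It generates one synthetic observation per posterior draw, then shows how a naive subsample (taking the first 1500 flattened draws, an assumption made purely for illustration) leaves some chains unrepresented.

```python
import random
import statistics

random.seed(0)

N_CHAINS, N_DRAWS = 4, 1000

# Stand-in for a trace: posterior draws of a mean `mu`, one list per chain.
# (Simulated here; a real trace would come from pm.sample.)
trace = [[random.gauss(5.0, 0.2) for _ in range(N_DRAWS)]
         for _ in range(N_CHAINS)]


def posterior_predictive(mu_draws, sigma=1.0):
    """Draw one synthetic observation y_i ~ Normal(mu_i, sigma) per posterior draw."""
    return [random.gauss(mu, sigma) for mu in mu_draws]


# Default described in the docs: one predictive sample per posterior sample,
# i.e. number of draws times number of chains.
all_draws = [mu for chain in trace for mu in chain]
ppc = posterior_predictive(all_draws)

# The predictive samples add observation noise on top of parameter uncertainty,
# so they are more spread out than the posterior draws of `mu` themselves.
post_sd = statistics.stdev(all_draws)
ppc_sd = statistics.stdev(ppc)

# Naive undersampling (illustrative assumption): keep the first 1500 flattened draws.
subset = all_draws[:1500]
chains_seen = {i // N_DRAWS for i in range(len(subset))}
print(len(ppc), chains_seen)
```

In this sketch the subsample only ever touches the first two chains, which is the kind of situation the documentation warns about when the number of predictive samples is changed from its default.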
Does this help?