Is there a way to generate synthetic data sets using PyMC where the synthetic data would capture the correct relationships and distributions of the model and the original data set? thanks. Paul

Hi Paul!

`pm.sample_posterior_predictive`

generates artificial (observed) data, given the model and covariates; this might be what you want? There's an example notebook showing how it is used here

Welcome!

I'm not sure what the "correct" relationships/distribution of both the model *and* the data might be. You can generate posterior predictive samples, which are draws from a posterior used to generate credible (synthetic) data. Or, if you want to investigate the dependencies among model parameters (ignoring observed data), you can sample from your model without including any of the observed variables. That should yield an MCMC trace that includes draws from your prior that are then pushed through the rest of your model:

```
import pymc as pm

with pm.Model() as model:
    a = pm.Gamma("a", alpha=1, beta=1)
    b = pm.Normal("b", mu=a, sigma=1)
    c = pm.StudentT("c", mu=b, sigma=1, nu=3)
    idata = pm.sample(10)

print(idata.posterior)
```

yields:

```
Coordinates:
  * chain    (chain) int64 0 1 2 3
  * draw     (draw) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    b        (chain, draw) float64 1.368 1.93 1.213 3.031 ... 1.81 0.6625 0.2195
    c        (chain, draw) float64 3.798 3.018 4.84 5.91 ... 1.844 2.292 -0.6863
    a        (chain, draw) float64 0.4783 0.5831 0.8971 ... 0.2333 0.7063 0.4604
Attributes:
    created_at:                 2022-07-15T01:55:13.283567
    arviz_version:              0.12.1
    inference_library:          pymc
    inference_library_version:  4.1.2
    sampling_time:              0.702103853225708
    tuning_steps:               1000
```

Are either of those what you are looking for?

thank you

Paul

Yes, this is what I was looking for.

thanks

Paul