Bayesian Data Production Operations


Has anyone put a PyMC model into production and actually saved the samples for later analysis into a database? If so, what is the best database design to save posterior samples?

CC @michaelosthege

Hi @jordan.howell2,

at the moment most people fall back to InferenceData.to_netcdf and manage traces at the filesystem level (e.g. S3), but you can also use a real database with mcbackend.ClickHouseBackend, as shown here.
This enables live access to the draws while the sampler is still running.
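
On the database design part of the question, here is a rough sketch of one possible "long format" layout, with one row per run, chain, draw, variable, and flattened element. This is only an illustration and not the schema mcbackend uses: it relies on the standard-library sqlite3 module so it runs anywhere, and the table names, column names, and the helpers store_draw and posterior_mean are all hypothetical. The draws are random numbers standing in for real sampler output.

```python
import random
import sqlite3

# Hypothetical schema: one metadata row per run, one row per scalar draw value.
SCHEMA = """
CREATE TABLE runs (
    run_id  TEXT PRIMARY KEY,
    model   TEXT NOT NULL,
    created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE draws (
    run_id   TEXT    NOT NULL REFERENCES runs(run_id),
    chain    INTEGER NOT NULL,
    draw     INTEGER NOT NULL,
    var_name TEXT    NOT NULL,
    flat_idx INTEGER NOT NULL,  -- position in the flattened variable
    value    REAL    NOT NULL,
    PRIMARY KEY (run_id, chain, draw, var_name, flat_idx)
);
"""


def store_draw(conn, run_id, chain, draw, values):
    """Insert one draw; `values` maps variable name -> flat list of floats."""
    rows = [
        (run_id, chain, draw, name, idx, float(val))
        for name, flat in values.items()
        for idx, val in enumerate(flat)
    ]
    conn.executemany("INSERT INTO draws VALUES (?, ?, ?, ?, ?, ?)", rows)


def posterior_mean(conn, run_id, var_name):
    """Mean over all chains and draws, one entry per flattened element."""
    cur = conn.execute(
        "SELECT flat_idx, AVG(value) FROM draws "
        "WHERE run_id = ? AND var_name = ? "
        "GROUP BY flat_idx ORDER BY flat_idx",
        (run_id, var_name),
    )
    return [mean for _, mean in cur]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO runs (run_id, model) VALUES (?, ?)", ("run-001", "demo_model")
)

# Stand-in for a sampler: 2 chains x 100 draws of a scalar and a 2-vector.
rng = random.Random(0)
for chain in range(2):
    for draw in range(100):
        store_draw(
            conn, "run-001", chain, draw,
            {"mu": [rng.gauss(0, 1)],
             "beta": [rng.gauss(1, 0.5), rng.gauss(-1, 0.5)]},
        )
conn.commit()

print(posterior_mean(conn, "run-001", "beta"))
```

The long format is easy to query and append to while sampling is in progress, but it stores one row per scalar, so it grows quickly for large models. A column-oriented store, or keeping the NetCDF files and only indexing run metadata in the database, may be a better fit at that scale.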

Long term I'm working towards switching the PyMC internals to use mcbackend, so any contributions are welcome!