Bayesian Data Production Operations

jordan.howell2 · October 27, 2022, 5:08pm

Hello,

Has anyone put a PyMC model into production and actually saved the samples for later analysis into a database? If so, what is the best database design to save posterior samples?

ricardoV94 · October 29, 2022, 12:00pm

CC @michaelosthege

michaelosthege · October 30, 2022, 12:48pm

Hi @jordan.howell2,

at the moment most people fall back to InferenceData.to_netcdf and managing traces at the filesystem level (e.g. S3), but you can also use a real database with the mcbackend.ClickHouseBackend as shown here.
This enables live access to the draws while the sampler is still running.
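
To make the database route a bit more concrete, here is a minimal sketch of one possible layout: a long-format table with one row per run, chain, draw, variable and flattened element. It uses Python's built-in sqlite3 purely for illustration. This is not the schema that mcbackend or ClickHouseBackend uses; the table name, column names and the append_draw / posterior_mean helpers are all hypothetical, and the random numbers only stand in for real sampler output.

```python
import random
import sqlite3

# Hypothetical long-format schema (an assumption, not mcbackend's layout):
# one row per (run, chain, draw, variable, flattened element index).
SCHEMA = """
CREATE TABLE IF NOT EXISTS draws (
    run_id   TEXT    NOT NULL,
    chain    INTEGER NOT NULL,
    draw     INTEGER NOT NULL,
    variable TEXT    NOT NULL,
    idx      INTEGER NOT NULL,
    value    REAL    NOT NULL,
    PRIMARY KEY (run_id, chain, draw, variable, idx)
)
"""


def append_draw(conn, run_id, chain, draw, point):
    """Insert one draw; `point` maps a variable name to a flat list of floats."""
    rows = [
        (run_id, chain, draw, name, i, float(v))
        for name, values in point.items()
        for i, v in enumerate(values)
    ]
    conn.executemany("INSERT INTO draws VALUES (?, ?, ?, ?, ?, ?)", rows)
    # Committing per draw is what would let other connections to a file or
    # server database read the draws while sampling is still in progress.
    conn.commit()


def posterior_mean(conn, run_id, variable, idx=0):
    """Return (mean, number of draws) over everything stored so far."""
    cur = conn.execute(
        "SELECT AVG(value), COUNT(*) FROM draws "
        "WHERE run_id = ? AND variable = ? AND idx = ?",
        (run_id, variable, idx),
    )
    return cur.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

rng = random.Random(0)
for draw in range(100):
    for chain in range(2):
        # Stand-in for a sampler step; a real run would pass actual draws here.
        point = {
            "mu": [rng.gauss(0.0, 1.0)],
            "beta": [rng.gauss(0.0, 1.0) for _ in range(3)],
        }
        append_draw(conn, "run-001", chain, draw, point)

mean_mu, n_mu = posterior_mean(conn, "run-001", "mu")
n_rows = conn.execute("SELECT COUNT(*) FROM draws").fetchone()[0]
```

A long format like this keeps the schema independent of the model's variables and shapes, at the cost of more rows; a wide table with one column per variable would be more compact but has to change whenever the model does.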

Long term I’m working towards switching the PyMC internals to use mcbackend, so any contributions are welcome!
