Workflow of refining a model: how to store a history of outputs of the sampler

I am developing a model that takes several hours to run, so I do not want to rerun the same model repeatedly. I am playing around with different parametrizations (e.g. a Gamma distribution with alpha and beta, or with mu and sigma, as parameters) and also with different arguments to pm.sample() (e.g. the target_accept parameter).
I am using Jupyter notebooks for the coding and plotting, which works well for me. However, I would sometimes love to have a history of previous versions of the model together with the output of the sampler (i.e. the message about how long sampling took, and possibly the number of divergences and of draws that hit the maximum tree depth). At the moment I try to keep this information in a separate text file, but I often forget to put it there, and it is lost once I restart the notebook.

Does anyone have a workflow for this kind of model tweaking? Do you have recommendations of how to automatically save the model specification and the output of the sampler to have a record of what worked well and what didn’t?
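One low-tech approach is to append one JSON record per sampling run to a plain-text log file. This is only a sketch: the helper `log_run`, the field names, and the file `runs.jsonl` are my own invention, not an existing API, and the diagnostic values would come from wherever you read them off the sampler output.

```python
import json
import time


def log_run(log_path, model_desc, sampler_kwargs, stats):
    """Append one JSON record per sampling run to a plain-text log.

    model_desc:     free-text note on the parametrization tried
    sampler_kwargs: the keyword arguments passed to pm.sample()
    stats:          diagnostics worth remembering (divergences, runtime, ...)
    """
    record = {
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "model": model_desc,
        "sampler_kwargs": sampler_kwargs,
        "stats": stats,
    }
    # Append mode, so the log accumulates a history across notebook restarts
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")


# Example: record one run (all values here are made up)
log_run(
    "runs.jsonl",
    "Gamma(mu, sigma) parametrization",
    {"tune": 1000, "draws": 1000, "target_accept": 0.95},
    {"divergences": 3, "sampling_time_s": 7421.0},
)
```

Because each line is a self-contained JSON object, the log survives notebook restarts and can later be loaded into a DataFrame for comparison across runs.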

What I already do (which saves me a lot of time!) is to always try to load a previously computed trace and only sample anew if it does not yet exist:

import arviz as az
import pymc as pm

name = "my_model_trace_name.az"
try:
    # Reuse the saved trace if it exists ...
    trace = az.from_netcdf(name)
except FileNotFoundError:
    # ... otherwise sample anew and save the result
    with model:
        trace = pm.sample(tune=1000, draws=1000, target_accept=0.95)
    trace.to_netcdf(name)
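To capture the diagnostics mentioned above in such a log, a small helper could condense the per-draw sampler statistics into a few numbers. This is a sketch: `summarize_sampling` is my own invention, and it deliberately takes plain lists so it stays independent of any library; in a real run you would pass something like the per-draw divergence flags and tree depths read from the trace's sampler statistics.

```python
def summarize_sampling(diverging, tree_depth, max_treedepth=10):
    """Condense per-draw NUTS diagnostics into a small dict worth logging.

    diverging:  one boolean per draw (True where the transition diverged)
    tree_depth: one integer per draw (depth of the NUTS tree for that draw)
    """
    return {
        "n_draws": len(diverging),
        "n_divergences": sum(bool(d) for d in diverging),
        # Draws that saturated the tree depth limit (a possible tuning problem)
        "n_max_treedepth": sum(td >= max_treedepth for td in tree_depth),
    }


# Example with made-up per-draw values
summary = summarize_sampling(
    diverging=[False, False, True, False],
    tree_depth=[5, 7, 10, 6],
)
print(summary)  # {'n_draws': 4, 'n_divergences': 1, 'n_max_treedepth': 1}
```

The resulting dict is exactly the kind of record that could go into the text file (or a structured log) alongside the model description and the pm.sample() arguments.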

Hi Thea. MLflow might be useful for what you are trying to do.

Thanks, I’ll take a look! 🙂