Workflow of refining a model: how to store a history of outputs of the sampler

I am developing a model that takes several hours to run, so I do not want to rerun the same model repeatedly. I am playing around with different parametrizations (e.g. a Gamma distribution with alpha and beta, or with mu and sigma, as parameters) and also with different arguments to pm.sample() (e.g. the target_accept parameter).
I am using Jupyter notebooks for the coding and plotting, which works well for me. However, I would sometimes love to have a history of previous versions of the model together with the output of the sampler (i.e. the message about how long sampling took, and possibly the number of divergences and of draws that hit the maximum tree depth). At the moment I try to keep this information in a separate text file, but I often forget to put it there, and it is lost once I restart the notebook.

Does anyone have a workflow for this kind of model tweaking? Do you have recommendations of how to automatically save the model specification and the output of the sampler to have a record of what worked well and what didn’t?
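One low-tech approach is to append one JSON record per sampling run to a plain-text log file. This is only a sketch: the helper `log_run`, the field names, and the file `runs.jsonl` are my own invention, not an existing API, and the diagnostic values would come from wherever you read them off the sampler output.

```python
import json
import time


def log_run(log_path, model_desc, sampler_kwargs, stats):
    """Append one JSON record per sampling run to a plain-text log.

    model_desc:     free-text note on the parametrization tried
    sampler_kwargs: the keyword arguments passed to pm.sample()
    stats:          diagnostics worth remembering (divergences, runtime, ...)
    """
    record = {
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "model": model_desc,
        "sampler_kwargs": sampler_kwargs,
        "stats": stats,
    }
    # Append mode, so the log accumulates a history across notebook restarts
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")


# Example: record one run (all values here are made up)
log_run(
    "runs.jsonl",
    "Gamma(mu, sigma) parametrization",
    {"tune": 1000, "draws": 1000, "target_accept": 0.95},
    {"divergences": 3, "sampling_time_s": 7421.0},
)
```

Because each line is a self-contained JSON object, the log survives notebook restarts and can later be loaded into a DataFrame for comparison across runs.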

What I already do (which saves me a lot of time!) is to always try to load a previously computed trace and only sample anew if it does not yet exist:

import arviz as az
import pymc as pm

name = "my_model_trace_name.az"
try:
    # Reuse the saved trace if it exists ...
    trace = az.from_netcdf(name)
except FileNotFoundError:
    # ... otherwise sample anew and save the result
    with model:
        trace = pm.sample(tune=1000, draws=1000, target_accept=0.95)
    trace.to_netcdf(name)
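To capture the diagnostics mentioned above in such a log, a small helper could condense the per-draw sampler statistics into a few numbers. This is a sketch: `summarize_sampling` is my own invention, and it deliberately takes plain lists so it stays independent of any library; in a real run you would pass something like the per-draw divergence flags and tree depths read from the trace's sampler statistics.

```python
def summarize_sampling(diverging, tree_depth, max_treedepth=10):
    """Condense per-draw NUTS diagnostics into a small dict worth logging.

    diverging:  one boolean per draw (True where the transition diverged)
    tree_depth: one integer per draw (depth of the NUTS tree for that draw)
    """
    return {
        "n_draws": len(diverging),
        "n_divergences": sum(bool(d) for d in diverging),
        # Draws that saturated the tree depth limit (a possible tuning problem)
        "n_max_treedepth": sum(td >= max_treedepth for td in tree_depth),
    }


# Example with made-up per-draw values
summary = summarize_sampling(
    diverging=[False, False, True, False],
    tree_depth=[5, 7, 10, 6],
)
print(summary)  # {'n_draws': 4, 'n_divergences': 1, 'n_max_treedepth': 1}
```

The resulting dict is exactly the kind of record that could go into the text file (or a structured log) alongside the model description and the pm.sample() arguments.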

Hi Thea. MLflow might be useful for what you are trying to do.

Thanks, I’ll take a look! 🙂