How to save PyMC v5 models?

What is a recommended v5 way to save a PyMC model object?

I looked at pickle but got AttributeError: Can't pickle local object '_make_nice_attr_error.<locals>.fn'.

Maybe dill would work, but I wanted to see what people generally do in the current state.

dill · PyPI

Here’s a tutorial: Using ModelBuilder class for deploying PyMC models — PyMC example gallery
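Roughly, you subclass ModelBuilder once and then saving/loading looks like this sketch (paraphrasing the tutorial; MyLinearModel, X, y and X_new are placeholders for your own subclass and data):

from pymc_experimental.model_builder import ModelBuilder  # pip install pymc-experimental

# MyLinearModel is a hypothetical ModelBuilder subclass that implements
# build_model(), _data_setter(), etc. as shown in the tutorial.
model = MyLinearModel()
idata = model.fit(X, y)                     # wraps pm.sample()

model.save("linear_model.nc")               # persists trace + model config
model_2 = MyLinearModel.load("linear_model.nc")
preds = model_2.predict(X_new)              # out-of-sample prediction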


The ModelBuilder class is clearly the way to go, but if you’re looking for a quick and dirty solution, I’ve been wrapping my trace and model inside a Python dict and saving it as a pickle.

import cloudpickle

pickle_filepath = 'path/to/pickle.pkl'

# Bundle everything needed to reload later: the model, the trace, and any
# preprocessing objects (here a z-score recovery dict).
dict_to_save = {'model': model_name,
                'idata': idata,
                'recovery_dict': z_score_recovery_dict,
                }

with open(pickle_filepath, 'wb') as buff:
    cloudpickle.dump(dict_to_save, buff)

Then the load would be:

import cloudpickle
import pymc as pm

pickle_filepath = 'path/to/pickle.pkl'
with open(pickle_filepath, 'rb') as buff:
    model_dict = cloudpickle.load(buff)

idata = model_dict['idata']
model = model_dict['model']

# The reloaded model behaves like the original, so posterior predictive
# sampling proceeds as usual:
with model:
    ppc_logit = pm.sample_posterior_predictive(idata)

I’ve had issues saving NetCDF files on Databricks, and as long as you keep the Python and cloudpickle versions consistent between saving and loading, you should be ok.
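For contrast, the NetCDF route I had trouble with is just the standard ArviZ round trip (file name made up); note it only stores the InferenceData, not the model object, which is another reason the pickled dict is handy:

import arviz as az

idata.to_netcdf('path/to/trace.nc')            # stores only the InferenceData
idata_loaded = az.from_netcdf('path/to/trace.nc')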


@twiecki What about the case of model checkpointing? I am working on a compute cluster where I may get pre-empted after a certain amount of time. Is there any way I can save the model at set intervals with this workflow, so it can be loaded later to continue sampling where I left off?

Currently that’s not supported. You could just sample 100 draws at a time, then save, then continue, etc. There’s also GitHub - pymc-devs/mcbackend: A backend for storing MCMC draws, by @michaelosthege, for storing traces on another machine.
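A rough sketch of that pattern (build_model is a placeholder for however you construct your model; note each chunk re-tunes, so it is not identical to one long run):

import arviz as az
import cloudpickle
import pymc as pm

model = build_model()     # placeholder: returns your pm.Model
chunks = []
for i in range(5):        # 5 chunks of 100 draws each
    with model:
        chunk = pm.sample(draws=100, tune=500)
    chunks.append(chunk)
    # Checkpoint after every chunk, so a pre-emption loses at most one chunk.
    with open(f'checkpoint_{i}.pkl', 'wb') as buff:
        cloudpickle.dump(chunk, buff)

# Stitch the chunks back together into one trace.
idata = az.concat(chunks, dim='draw')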

@twiecki That makes sense, I really appreciate your response! I assume a model sampled for 200 draws is roughly equivalent to one sampled for 100 draws, saved, loaded, and then sampled for 100 more.

Hello Kraftfaust

We also ran into file size constraints in Databricks while saving the .nc files.
As this thread is two years old, have you found a better solution for saving and loading the model to MLflow in Databricks?

Best wishes

Hello,

Great question. So far, saving and loading a cloudpickle object has been sufficient for my day-to-day needs. However, I have also been wondering if there is a more robust MLflow-based solution.

On a previous non-PyMC project I built out a custom MLflow pyfunc class which was flexible enough to handle a variety of edge cases. That’s where I would start if I wanted to wrap a PyMC model in MLflow: mlflow.pyfunc.

Apologies for jumping in, but your answer seemed to line up with a recent question I had. For out-of-sample prediction, how are you handling the use of pyfunc?

MLflow recommends passing a config file to a custom predict method, but to my knowledge we can’t do that as directly in PyMC. Are you just putting the function that defines your PyMC model inside some fit method of the class, and MLflow allows this to be served on its endpoint? The MLflow documentation has trivial examples, and I’m not really sure how to proceed at this time.

Thank you for your time!

Hello, no apologies necessary, happy to try to help. It’s been a while since I built out the pyfunc class, but yes, I would wrap all your PyMC model inference calls inside the custom predict method of an MLflow wrapper class. When you call .predict() on an MLflow model object you are actually calling that custom predict method, so whatever you put in there will be run, and it can be anything. And then yes, when you host and serve the MLflow endpoint and call predict, everything in that .predict() method is run.

# Model wrapper class
# Classic Iris example that you can modify to use:

import mlflow
import pandas as pd


class ModelWrapper(mlflow.pyfunc.PythonModel):
    # Initialize with an already-fitted model in the constructor
    def __init__(self, model):
        self.model = model

    # Prediction function
    def predict(self, context, model_input):
        # This is where you would add your PyMC custom code, e.g.:
        # with model:
        #     ...

        # Predict the classes and class probabilities
        class_labels = ["setosa", "versicolor", "virginica"]
        predictions = self.model.predict(model_input)
        predicted_probabilities = self.model.predict_proba(model_input)

        # Create a DataFrame to hold the results
        result = pd.DataFrame(
            predicted_probabilities,
            columns=[f'prob_{label}' for label in class_labels],
        )
        result['prediction'] = [class_labels[p] for p in predictions]

        return result
Deploy Python code with Model Serving | Databricks on AWS
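To get it behind an endpoint, you would then log and reload the wrapper roughly like this (a sketch; "iris_model" is a made-up artifact path, and model / X_new are your fitted estimator and new input data):

import mlflow

with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        artifact_path="iris_model",
        python_model=ModelWrapper(model),
    )

# Later (or behind a serving endpoint), load it back and predict:
loaded = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/iris_model")
result = loaded.predict(X_new)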

Oh, that makes it look way easier than the MLflow docs suggest xD

So if my use case were more to train a model, save it, and then do out-of-sample prediction (after I have a trained model), would it be appropriate within the wrapper to do something like:

# Model wrapper class
# PyMC version of the Iris wrapper above (rough sketch):

import mlflow
import pandas as pd
import pymc as pm


class ModelWrapper(mlflow.pyfunc.PythonModel):
    # Initialize model in the constructor
    def __init__(self):
        # assume I have some logic here to check if it's been fit already
        self.model = None
        self.idata = None

    # Build (and fit) the model
    def build_model(self, X, y):
        # This is where you would add the PyMC custom code,
        # putting X and y into pm.Data containers.
        with pm.Model() as model:
            # ... priors and likelihood go here ...
            self.idata = pm.sample()
        self.model = model

    # Prediction function
    def predict(self, context, model_input):
        # Point the data container at the new data and sample the posterior
        # predictive ("X" / "y_obs" are whatever names were used when
        # building the model).
        with self.model:
            pm.set_data({"X": model_input})
            ppc = pm.sample_posterior_predictive(self.idata)

        # Leaving this part loose because I can just convert the ArviZ
        # output to pandas via numpy, e.g. the posterior predictive mean:
        preds = ppc.posterior_predictive["y_obs"].mean(dim=("chain", "draw"))
        result = pd.DataFrame({"prediction": preds.values})

        return result

Right now, for out-of-sample prediction I am basically dumping the model metadata to a NetCDF file that MLflow logs and stores for me… but this seems inelegant and slower than necessary. For instance, if I train a BART model, I must also save the file that contains the split rules, which seems silly (I realize that pickle and the like implicitly do this, but it just feels like I’m trying to do too much here).

After I train a model, I basically just want to save it and call it from another notebook. I could do this with a pickle, but as you know, I would lose a lot of the MLflow functionality.
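In other words, something like this rough sketch (file name made up) is closer to what I’m imagining: combine the cloudpickled dict from earlier in the thread with MLflow artifact logging, so the run is still tracked but I can pull the model into another notebook:

import cloudpickle
import mlflow

# Save the fitted model + trace and attach it to an MLflow run.
with mlflow.start_run() as run:
    with open("bart_model.pkl", "wb") as buff:
        cloudpickle.dump({"model": model, "idata": idata}, buff)
    mlflow.log_artifact("bart_model.pkl")

# In the other notebook: pull the artifact back down and unpickle it.
local_path = mlflow.artifacts.download_artifacts(
    run_id=run.info.run_id, artifact_path="bart_model.pkl"
)
with open(local_path, "rb") as buff:
    model_dict = cloudpickle.load(buff)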