How to save my trace in the system?

EduardoCabria · April 11, 2021, 6:46pm

Hi community,

My goal is to save my trace after pm.sample so that, once inference has finished and I have closed the Jupyter notebook where I am working with my PyMC3 model, I can load it again from my computer instead of rerunning the model. How can I do this?

I am fitting models that take a long time to sample, and I don’t want to rerun them every time I open Jupyter.

I can’t find anything about this in the PyMC3 documentation. Thanks.

OriolAbril · April 11, 2021, 6:53pm

You can save your results as a netCDF file using ArviZ:

import pymc3 as pm

with model:  # `model` is your existing pm.Model
    idata = pm.sample(return_inferencedata=True)
idata.to_netcdf("filename.nc")

then load the results with:

import arviz as az
idata = az.from_netcdf("filename.nc")
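
For anyone who wants to check the save/load round trip without waiting for a slow model, here is a minimal sketch. It assumes an ArviZ 0.x release with a netCDF backend installed, and it builds the InferenceData from synthetic draws with az.from_dict instead of running pm.sample. The variable name "mu" and the file name are made up for illustration.

```python
import os
import tempfile

import arviz as az
import numpy as np

# Synthetic stand-in for sampler output: 2 chains x 100 draws of a
# hypothetical variable "mu" (no PyMC3 model is run here).
rng = np.random.default_rng(0)
idata = az.from_dict(posterior={"mu": rng.normal(size=(2, 100))})

# Save to disk, then reload as you would in a fresh notebook session.
path = os.path.join(tempfile.mkdtemp(), "trace.nc")
idata.to_netcdf(path)
loaded = az.from_netcdf(path)

# Check that the reloaded posterior holds the same draws as the original.
same = np.allclose(loaded.posterior["mu"].values, idata.posterior["mu"].values)
```

With a real model, the only change should be that idata comes from pm.sample(return_inferencedata=True) rather than az.from_dict.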

It is also possible to use zarr instead. It is still a very new library and may fail in some cases (I haven’t tried it much myself yet), but it has some advantages over netCDF.

EduardoCabria · April 12, 2021, 7:15am

Thank you @OriolAbril, I will try this solution.

EduardoCabria · April 12, 2021, 5:43pm

Hi Oriol,
I downloaded netCDF from conda (Netcdf4 :: Anaconda.org) to use in Jupyter, but I still got this error:

What am I doing wrong? Is it because I have not included return_inferencedata=True in pm.sample?

Thank you, mate.

OriolAbril · April 12, 2021, 6:09pm

Yes, pm.sample currently returns a pm.MultiTrace rather than an az.InferenceData, so for now you need to include this argument. Our goal is to make return_inferencedata=True the default in PyMC3 4.0.

Useful links on the topic:

Introduce `BestPracticeWarning` to help users with their learning curve · Issue #4638 · pymc-devs/pymc3 · GitHub
Make `pm.sample` return `InferenceData` by default · Issue #4372 · pymc-devs/pymc3 · GitHub

EduardoCabria · April 12, 2021, 7:05pm

Thank you so much @OriolAbril.
Sorry for being so annoying, but this is my last question. I have never worked with ArviZ. How do I change what I used before (trace["blablabla"]) to the ArviZ version? (the problem is attached)

OriolAbril · April 12, 2021, 9:35pm

I generally recommend using

    ...
    idata = pm.sample(...
trace = idata.posterior  # optionally add .stack(sample=("chain", "draw"))

trace is then an xarray.Dataset, which has a lot of features: you can select variables with ["var-name"], compute means and quantiles, plot the variables…
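
As a rough illustration of those operations, the sketch below uses synthetic draws built with az.from_dict (ArviZ 0.x assumed) so that it runs without a model. The variable names "alpha" and "beta" are hypothetical. The stacked version at the end is the closest match to the old trace["var"] indexing, where chains are concatenated along the first axis.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(1)
# Synthetic posterior: 4 chains x 250 draws of two hypothetical variables.
idata = az.from_dict(posterior={
    "alpha": rng.normal(loc=1.0, size=(4, 250)),
    "beta": rng.normal(size=(4, 250, 3)),
})

trace = idata.posterior  # xarray.Dataset with dims (chain, draw, ...)

# Select a variable by name, as with trace["alpha"] on a MultiTrace.
alpha = trace["alpha"]
alpha_mean = float(alpha.mean())  # mean over all chains and draws

# 5% and 95% quantiles of each beta component, pooled over chains and draws.
beta_q = trace["beta"].quantile([0.05, 0.95], dim=("chain", "draw"))

# Flatten chain and draw into one "sample" dimension.
stacked = trace.stack(sample=("chain", "draw"))
alpha_flat = stacked["alpha"].values
# Put "sample" first so the array is laid out as (samples, components).
beta_flat = stacked["beta"].transpose("sample", ...).values
```

Keeping the chain and draw dimensions separate is usually more convenient for diagnostics, so stacking is best left as an optional last step.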

Here are some docs that showcase InferenceData capabilities:

A Primer on Bayesian Methods for Multilevel Modeling — PyMC3 3.11.2 documentation
Rugby analytics case study (docs will be updated with these changes with next release)
Introduction to xarray, InferenceData, and NetCDF for ArviZ — ArviZ dev documentation
Working with InferenceData — ArviZ dev documentation
