Saving and Loading GP model in PYMC3

Sudipta · August 24, 2018, 7:07pm

Is there a way to save and load a GP model in PyMC3? Per https://media.readthedocs.org/pdf/pymcmc/jss-gp/pymcmc.pdf, it’s almost impossible to save and load models in PyMC3…

junpenglao · August 25, 2018, 6:41am

You can try pickling the model, but the recommended way is to save the trace using pm.save_trace and pm.load_trace: https://github.com/pymc-devs/pymc3/pull/2975. You then re-initialize the model every time (i.e., execute the with pm.Model()... block).

Sudipta · August 25, 2018, 3:50pm

Thanks very much… I’m still unclear about one thing.
Do you happen to have sample code for predicting with this trace-load functionality? I have a Gaussian model:
y = f(x) + e…
f(x) ~ Gaussian(a, b),
e ~ N(0, sigma^2)
The trace saves the posterior of a, b and sigma…

My objective is to predict f(x) for a new x in a new Python session without running the model training piece. So if I use the “trace load”, do I have to load the training sample separately? Sample code would be useful.

junpenglao · August 25, 2018, 7:39pm

You need to rerun the model block, so you would need the training data again. But you don't need to do inference again (i.e., you call trace = pm.load_trace(...) instead of trace = pm.sample(...)). The prediction part is the same in both cases.
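
To see why the training data has to come back along with the trace, here is a minimal NumPy sketch of a GP prediction. It is not PyMC3's implementation. It assumes a zero-mean GP with a squared-exponential kernel, and the names exp_quad, gp_predict_mean, eta, ls and sigma are illustrative stand-ins; the hard-coded draws play the role of values that would be read from a loaded trace. The trace only supplies the hyperparameter draws, while the predictive mean is still computed from x_train and y_train.

```python
import numpy as np

def exp_quad(xa, xb, eta, ls):
    """Squared-exponential kernel: eta^2 * exp(-(xa - xb)^2 / (2 * ls^2))."""
    d = xa[:, None] - xb[None, :]
    return eta**2 * np.exp(-0.5 * (d / ls) ** 2)

def gp_predict_mean(x_train, y_train, x_new, eta, ls, sigma):
    """Predictive mean of f(x_new) for a single hyperparameter draw."""
    # Covariance of the noisy training observations.
    K = exp_quad(x_train, x_train, eta, ls) + sigma**2 * np.eye(len(x_train))
    # Cross-covariance between the new inputs and the training inputs.
    K_s = exp_quad(x_new, x_train, eta, ls)
    return K_s @ np.linalg.solve(K, y_train)

# Toy training data (assumption: in practice this is reloaded from disk).
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train)
x_new = np.array([1.25, 2.5])

# Hypothetical posterior draws standing in for values from a loaded trace.
draws = [
    {"eta": 1.0, "ls": 1.0, "sigma": 0.1},
    {"eta": 1.2, "ls": 0.8, "sigma": 0.1},
]

# Average the predictive mean over the draws.
pred = np.mean(
    [gp_predict_mean(x_train, y_train, x_new, **d) for d in draws], axis=0
)
print(pred)
```

Dropping x_train or y_train from gp_predict_mean leaves nothing to condition on, which is the reason the model block needs the training data even when no sampling is rerun.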

gaddamanil16 · September 15, 2018, 8:37pm

Hi @junpenglao,

So can I simply use pm.save_trace and save to a pickle file? Or how do I do it?
I am running my code on a remote server, and I don't understand how I can get the trace results back so that I can analyse them on my computer. And even if I do, will pm.load_trace work without the context?

Help much needed.

Thanks.

dsvolk · March 27, 2019, 12:07pm

You can actually avoid re-running the same model definition code twice. This is what I do:

In my “train” script, after I sampled the model, I put

import pickle
# Bundle the model, the trace, and the shared input tensor into one file.
with open(model_fpath, 'wb') as buff:
    pickle.dump({'model': model, 'trace': trace, 'X_shared': X_shared}, buff)

Here X_shared is the Theano shared tensor of the predictor variables.
Then in my separate “production predict” script I start with

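import pickle
import numpy as np
import theano

# Restore the objects saved by the train script.
with open(model_fpath, 'rb') as buff:
    data = pickle.load(buff)
model = data['model']
trace = data['trace']
X_shared = data['X_shared']

# Swap the production predictors into the shared tensor.
X_shared.set_value(np.asarray(X, theano.config.floatX))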

Here X are my predictor variables in production.
Now I can sample the posterior predictive:

with model:
    post_pred = pm.sample_ppc(trace)

Krishna_Kumar · January 6, 2022, 12:21pm

This works like a charm… Thank you !!

One small change to the prediction API: sample_ppc is deprecated. Instead we should use sample_posterior_predictive:

post_pred = pm.sample_posterior_predictive(trace_reload, model=model_reload, samples=100)
