Saving and loading a GP model in PyMC3

Is there a way to save and load a GP model in PyMC3? According to https://media.readthedocs.org/pdf/pymcmc/jss-gp/pymcmc.pdf, it is almost impossible to save and load models in PyMC3.

You can try pickling the model, but the recommended way is to save the trace with pm.save_trace and load it back with pm.load_trace: https://github.com/pymc-devs/pymc3/pull/2975. You then re-initialize the model every time (i.e., re-execute the with pm.Model()... block).

Thanks very much. I'm still unclear about one thing.
Do you happen to have sample code for predicting with this trace loading functionality? I have a Gaussian model:
y = f(x) + e
f(x) ~ Gaussian(a, b)
e ~ N(0, sigma^2)
The trace saves the posterior of a, b and sigma.

My objective is to predict f(x) for a new x in a new Python session without rerunning the model training. So if I load the trace, do I have to load the training sample separately? Sample code would be useful.

You need to rerun the model block, so you will need the training data again. But you don't need to run inference again (i.e., instead of calling trace = pm.sample(...), you call trace = pm.load_trace(...)). The prediction part is the same in both cases.

Hi @junpenglao,

So can I simply use pm.save_trace and save to a pickle file? Or how do I do it?
I am running my code on a remote server, and I don't understand how to get the trace results back so that I can analyse them on my computer. And even if I do, will pm.load_trace work without the model context?

Help much needed.

Thanks.

You can actually avoid re-running the same model definition code twice. This is what I do:

In my “train” script, after I sampled the model, I put

import pickle
# Bundle the model, the trace, and the shared input variable into one file.
with open(model_fpath, 'wb') as buff:
    pickle.dump({'model': model, 'trace': trace, 'X_shared': X_shared}, buff)

Here X_shared is the Theano shared tensor of the predictor variables.
Then in my separate “production predict” script I start with

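import pickle
import numpy as np
import theano

# Restore the model, the trace, and the shared input variable.
with open(model_fpath, 'rb') as buff:
    data = pickle.load(buff)
model = data['model']
trace = data['trace']
X_shared = data['X_shared']

# Swap the training inputs for the production inputs.
X_shared.set_value(np.asarray(X, theano.config.floatX))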

Here X are my predictor variables in production.
Now I can sample the posterior predictive:

with model:
    post_pred = pm.sample_ppc(trace)

This works like a charm. Thank you!

One small change to the prediction API: sample_ppc is deprecated. Instead, we should use sample_posterior_predictive:

post_pred = pm.sample_posterior_predictive(trace_reload, model=model_reload, samples=100)
