Memory leak for GP prediction

I’m interested in fitting a GP once and then using the object to make new predictions many times. The following script shows that the memory footprint of the process grows with every call to gp.predict.

import numpy as np
import pymc3 as pm
import os, psutil

X = np.random.randn(100, 1)
y = np.random.randn(100, 1)

with pm.Model() as model:
    ls = pm.Uniform('ls', lower=0.1, upper=4.0)
    # Matern covariance on the single input column
    # (the exact kernel is assumed; the original call was truncated)
    matern = pm.gp.cov.Matern52(1, ls=ls, active_dims=[0])
    gp = pm.gp.Marginal(cov_func=matern)
    ll = gp.marginal_likelihood('ll', X, y, noise=1.0, is_observed=True)
    trace = pm.sample(chains=1, cores=1, tune=2, draws=2)
    Xnew = np.random.randn(200,1)
    process = psutil.Process(os.getpid())
    for i in range(30):
        _ = gp.predict(Xnew, diag=False)
        print('Memory usage (GB):', process.memory_info().rss / 1_000_000_000)

Using tracemalloc points to a large amount of expanding memory usage at this part of the Theano codebase but I can’t make heads or tails of what’s going on in there.
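For anyone who wants to reproduce this kind of diagnosis, tracemalloc’s snapshot diff is what pointed me there. Here is a minimal, self-contained sketch of the technique; the “leak” is simulated with a growing list rather than an actual Theano graph:

```python
import tracemalloc

# Start tracing allocations before doing any work.
tracemalloc.start()

graphs = []  # stands in for state that silently accumulates across calls

def predict_once():
    # In the real issue, each gp.predict call appears to rebuild part of the
    # graph; here we just append a largeish object to simulate the growth.
    graphs.append([0.0] * 50_000)

before = tracemalloc.take_snapshot()
for _ in range(10):
    predict_once()
after = tracemalloc.take_snapshot()

# The diff is sorted by size, so the top entries point at the exact
# source lines responsible for the new allocations.
for stat in after.compare_to(before, 'lineno')[:3]:
    print(stat)
```

Running this against the real script (wrapping the gp.predict loop instead of predict_once) is what surfaced the Theano code path mentioned above.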

Does anyone have an idea for a workaround to reuse the GP for prediction multiple times? I’ve struggled to create a compiled Theano function from gp.predictt, as doing so generates test-value errors.

I can’t be of much help here, but you can disable test-value errors during Theano function compilation.

Something like theano.config.compute_test_value = 'off' iirc
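For reference, the flag can be set in code before any functions are compiled (this is a standard Theano config option, not anything PyMC3-specific):

```python
import theano

# Disable test-value computation before compiling any functions
theano.config.compute_test_value = 'off'
```

It can also be set via the environment, e.g. `THEANO_FLAGS=compute_test_value=off`, before Python starts, which sidesteps anything that flips the flag at import time.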

That’s good advice, but in my case that setting appears to be overridden by whatever PyMC3 does at import time; that’s a separate issue, though.

I think I’ve seen this before and hadn’t been able to fix it. Definitely something in Theano is weird here. Is it that the Theano graph keeps expanding rather than being discarded on each assignment to _?

Yup, there’s no state being kept intentionally. The size of the memory increment is roughly proportional to the size of the graph being built.

After looking through the code some more, I suspect that the issue could be due to repeated memoization of the GP theano prediction function. @brandonwillard Do you think this issue might be solved by Replace custom memoize module with cachetools by brandonwillard · Pull Request #4509 · pymc-devs/pymc3 · GitHub?

The current form of the memoize replacement still uses an unbounded cache for methods (i.e. a normal dict). If the method memoization is causing this memory problem, then we can change to a bounded cache. Otherwise, I did replace the other non-method uses of memoization with a bounded LRU cache, so, if those were the cause, the PR should fix it.
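To illustrate the bounded vs. unbounded distinction with stdlib tools (using functools.lru_cache as a stand-in for the cachetools caches the PR actually uses):

```python
from functools import lru_cache

# Unbounded cache: every distinct argument is retained forever,
# just like memoizing into a plain dict.
@lru_cache(maxsize=None)
def unbounded(x):
    return x * 2

# Bounded LRU cache: at most 2 entries; the least-recently-used
# entry is evicted whenever a new one is added.
@lru_cache(maxsize=2)
def bounded(x):
    return x * 2

for i in range(100):
    unbounded(i)
    bounded(i)

print(unbounded.cache_info().currsize)  # 100 — grows without limit
print(bounded.cache_info().currsize)    # 2 — capped
```

If the memoized prediction graphs are never identical between calls, an unbounded cache grows by roughly one graph per call, which would match the memory increments reported above.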

I just changed the PR so that it uses an LRU cache for everything.


Great! I’ll try this out on the new version and see if it fixes the problem.


So I’m running into the same problem: when predicting, my model ends up hoarding over 1 TB of RAM (using PyMC3 3.9.3). Is the potential fix (Pull Request #4509) included in PyMC3 3.11.2 (released 14 March 2021)?

And @ckrapu, can you confirm that the fix works?

Best regards

@Polichinel I would expect that to be the fix. Any reason you can’t simply upgrade and test?

Sounds good @twiecki, thanks for the swift reply.

Yes, my model is running on a very large central server shared by a couple of universities in Denmark. As such, I’m discouraged from installing software or updating libraries on my own if it can be avoided, so even small updates like this should go through an official pipeline.

Thus I just wanted to make sure that I was giving them (the server support) the correct info: that I want version 3.11.2. It appears they are already on it, so I’ll update you soon about whether the problem is solved 🙂


Unfortunately, it appears that the problem still occurs in version 3.11.2, running off the v3 branch. I’ve opened an issue and will be documenting more information there.

Hi @ckrapu

I don’t want to jinx anything, but the update does appear to have fixed the problem on my end! Right now my loop is at 1454/10677 iterations, and at this point the model used to take up over 130 GB of RAM (on its way to over 1 TB). Right now it’s using 22 GB, and intriguingly that has barely changed since the loop began. A huge improvement!

Okay, that’s good to know. I’ve only been running 30 iterations to test, and I was seeing increments of ~10 MB each time, so maybe there’s some other source of variation I’m unaware of. I originally hit this with a script (which I can’t share for work reasons) using much larger datasets, so now I can go back and check whether that’s working better.


The loop is done (10677/10677). Everything appears to be in order, and memory never went substantially over 22 GB, compared to over 1 TB before the update. Well done indeed!

Hopefully you’ll get similar results, @ckrapu, when you run your larger dataset through your model.