PyMC3 3.2 / Theano crashes


PyMC3 has crashed on me twice today while I was doing two concurrent runs (on CPU). One process just dies; the other shows the following messages:

WARNING (theano.gof.compilelock): Overriding existing lock by dead process '5664' (I am process '4040')
WARNING (theano.gof.cmodule): Deleting (broken cache directory [EOF]): C:\Users\etyurin\AppData\Local\Theano\compiledir_Windows-7-6.1.7601-SP1-Intel64_Family_6_Model_79_Stepping_1_GenuineIntel-3.6.3-64

This shouldn’t be resource exhaustion, because the two runs together occupy about 2/3 of the cores and 1/2 of the RAM. Here are my Anaconda package versions:

theano                    0.9.0                    py36_0
pymc3                     3.2              py36ha0754d1_0
python                    3.6.3                h210ce5f_2

Any ideas/suggestions?

Did you try deleting the theano cache folder (C:\Users\etyurin\AppData\Local\Theano)?

I never deleted it by hand. I’m still waiting for the second run to complete, then I can do that.

Should I make it a regular procedure to delete the cache manually?

To be precise, each theano/pymc3 “mini-run” takes about 7-8 minutes, but I’m running dozens of them in a grid.

Sometimes there are residual issues after pymc3/theano crashes; I usually clear the cache when that happens.
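
If clearing the cache by hand gets tedious, it can be scripted. Below is a minimal sketch, not an official Theano procedure: the helper name `clear_theano_cache` is made up for illustration, and the path in the usage comment is the one from the question, so adjust it for your machine. It assumes no Theano/PyMC3 process is still running, since deleting the folder under a live process would likely break that run too.

```python
import os
import shutil


def clear_theano_cache(cache_dir):
    """Delete cache_dir and everything under it.

    Returns True if the directory existed and was removed,
    False if there was nothing to remove.
    (Hypothetical helper, not part of Theano or PyMC3.)
    """
    if not os.path.isdir(cache_dir):
        return False
    shutil.rmtree(cache_dir)
    return True


# Example usage, with the cache path from the question (an assumption
# for any other machine); run it only while no Theano job is active:
#
#     clear_theano_cache(r"C:\Users\etyurin\AppData\Local\Theano")
```

Theano recreates the folder and recompiles on the next run, so the first model after a purge will take longer to start. Theano also ships a `theano-cache` command-line script (`theano-cache purge`) that does a similar job, if it is on your PATH.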