Core dump error when running the GP Regression example

When I try to run the Gaussian Process Regression example,
I get a core dump error and it crashes the Jupyter kernel.

I used watermark to output my environment profile:

The watermark extension is already loaded. To reload it, use:
%reload_ext watermark
CPython 3.6.1
IPython 6.1.0

numpy 1.13.0
theano 0.9.0
pymc3 3.1

compiler : GCC 5.4.1 20170304
system : Linux
release : 4.8.0-58-generic
machine : x86_64
processor : x86_64
CPU cores : 8
interpreter: 64bit

I use GPU-enabled Theano.

Has anyone else run into a similar issue?


Does it also segfault when you disable GPU?

When I run on the CPU instead, it shows this error:

/py36env/lib/python3.6/site-packages/theano/compile/ in __call__(self, *args, **kwargs)
    883             outputs =
--> 884                 self.fn() if output_subset is None else
    885                 self.fn(output_subset=output_subset)

TypeError: expected type_num 11 (NPY_FLOAT32) got 12

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
      1 with model:
----> 2     trace = pm.sample(2000, burn=1000)
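For reference, the type numbers in that TypeError are NumPy's internal dtype codes, which you can inspect via the `dtype.num` attribute: the compiled Theano function expected float32 input but received float64.

```python
import numpy as np

# The Theano error mentions type_num 11 and 12; these are NumPy's
# internal dtype codes for float32 and float64 respectively.
assert np.dtype(np.float32).num == 11
assert np.dtype(np.float64).num == 12
```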

When I remove “floatX = float32” from the Theano configuration file, it runs fine on the CPU.
On the GPU it still crashes. When I update to Theano 0.10dev, still without “floatX = float32”, it works on the GPU.
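For context, the flag in question lives in Theano's config file; a minimal illustrative `~/.theanorc` containing the line I removed would look like this (the `device` value is just an example):

```
[global]
floatX = float32
device = gpu
```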

Thanks for suggesting that I try different configurations.

Did you cast all your input data to float32? You can use pm.floatX(data) to auto-cast.
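For anyone reading along, `pm.floatX` simply casts array input to `theano.config.floatX`; a minimal stand-in (assuming `floatX = float32` is set) behaves like this:

```python
import numpy as np

def floatX(X, dtype="float32"):
    # Mimics pymc3's pm.floatX: cast array-like input to the
    # configured Theano float dtype (assumed here to be float32).
    return np.asarray(X, dtype=dtype)

X = np.linspace(0, 10, 100)[:, None]  # float64 by default
y = np.sin(X).ravel()

assert floatX(X).dtype == np.float32
assert floatX(y).dtype == np.float32
```

Casting the observed data up front avoids the float32/float64 mismatch that the GPU backend trips over.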

Thanks, I will try later.
I do have a question: when I run the example, the CPU version is faster than the GPU version.
Does this mean the GPU doesn't actually help in the PyMC3 case?

If we had benchmarks for the different examples, it would help users decide whether to enable the CPU or the GPU. By the way, I use a 1080 Ti.
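A rough way to get such numbers yourself is a small timing harness around the sampling call; this is just a sketch, where `run_sampling` is a placeholder for something like a `pm.sample(...)` call under each device setting:

```python
import time

def benchmark(run_sampling, repeats=3):
    # Time a sampling callable several times and keep the best run,
    # so CPU and GPU configurations can be compared fairly.
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_sampling()
        times.append(time.perf_counter() - start)
    return min(times)

# Placeholder workload standing in for a real pm.sample(...) call.
elapsed = benchmark(lambda: sum(i * i for i in range(100000)))
assert elapsed >= 0.0
```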

Another finding: in my own example, when I increase the size of the observed data,
the sampling speed drops sharply right after the NUTS sampler initialization stage. Is this a bug, or did I do something wrong?

Auto-assigning NUTS sampler…
Initializing NUTS using ADVI…
Average Loss = -20,169: 65%|██████▌ | 13089/20000 [00:49<00:24, 285.52it/s]
Convergence archived at 13100
Interrupted at 13,100 [65%]: Average Loss = -1,186.2
1%|▏ | 33/2500 [00:13<57:04, 1.39s/it]

GPU support is still experimental, as we haven't fixed all the issues yet. It's thus possible that there is a lot of data communication during sampling, which would slow things down. You could try which might give you better speeds.

NUTS slowing down could mean that initialization failed, so once you get to the typical set the mass matrix is actually not well tuned. Here you could try which might do a better job.