When I run on CPU, it shows this error:
/py36env/lib/python3.6/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    883             outputs =\
--> 884                 self.fn() if output_subset is None else\
    885                 self.fn(output_subset=output_subset)
During handling of the above exception, another exception occurred:
TypeError                                 Traceback (most recent call last)
in ()
      1 with model:
----> 2     trace = pm.sample(2000, burn=1000)
When I remove `floatX = float32` from the Theano configuration file, it runs okay on CPU. On GPU, it still crashes. When I update to Theano 0.10dev, without `floatX = float32`, it works with GPU.
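For reference, a minimal Theano config sketch covering the two setups discussed above (the exact `device` value depends on which GPU backend your Theano version uses; `cuda` is the new backend in 0.10dev, older versions use `gpu`):

```ini
# ~/.theanorc -- delete the floatX line to fall back to float64,
# which is the change that made the CPU run work here
[global]
device = cuda
floatX = float32
```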
Thanks for pointing out that I should try different configurations.
Thanks, I will try later.
I do have a question: when I use the example, the CPU version runs faster than the GPU one. Does this mean the GPU doesn't actually help in the PyMC3 case? If we had benchmarks for different examples, it might help users choose whether to enable CPU or GPU. BTW, I use a 1080 Ti.
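For such a comparison, a simple wall-clock harness is enough; this is a hypothetical helper (not part of PyMC3) that times the same zero-argument callable several times and keeps the best run, which is less noisy than a single measurement:

```python
import time

def time_run(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()  # e.g. lambda: pm.sample(500, tune=500)
        best = min(best, time.perf_counter() - start)
    return best
```

To compare devices, you would run the same sampling callable once with Theano configured for `device=cpu` and once for the GPU device, and compare the two `time_run` results (note each process picks up the device flag at import time, so the two runs belong in separate processes).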
Another finding: when I increase the observed data size in my own example, the speed drops very quickly after the NUTS sampler initialization stage. Is it a bug, or did I do something wrong?
Auto-assigning NUTS sampler…
Initializing NUTS using ADVI…
Average Loss = -20,169: 65%|██████▌ | 13089/20000 [00:49<00:24, 285.52it/s]
Convergence archived at 13100
Interrupted at 13,100 [65%]: Average Loss = -1,186.2
1%|▏ | 33/2500 [00:13<57:04, 1.39s/it]
GPU is still experimental as we haven’t yet fixed all the issues. It’s thus possible that there is a lot of data communication during sampling which would slow things down. You could try https://github.com/pymc-devs/pymc3/pull/2345 which might give you better speeds.
NUTS slowing down could mean that initialization failed and once you get to the typical set the mass matrix is actually not well tuned. Here you could try https://github.com/pymc-devs/pymc3/pull/2327 which might do a better job.