Thanks, I will try it later.
I do have a question: when I run the example, the CPU version is faster than the GPU version.
Does this mean the GPU does not actually help in the PyMC3 case?
If there were benchmarks for different examples, they would help users decide whether to run on the CPU or the GPU. BTW, I am using a 1080 Ti.
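To make the benchmark idea concrete, here is a minimal timing sketch using only the Python standard library. The names `benchmark` and `toy_workload` are hypothetical and not part of PyMC3; `toy_workload` is just a placeholder for "build the model with n observations and sample". I assume the same script would be run once per device (for example by changing Theano's device setting) and the two sets of timings compared.

```python
import statistics
import time


def benchmark(fn, sizes, repeats=3):
    """Return {size: median wall-clock seconds} for calling fn(size)."""
    results = {}
    for n in sizes:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn(n)
            timings.append(time.perf_counter() - start)
        # The median is less sensitive to one-off slow runs than the mean.
        results[n] = statistics.median(timings)
    return results


def toy_workload(n):
    # Placeholder only: swap in the real model build + sampling call here.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    for n, secs in benchmark(toy_workload, [1_000, 10_000, 100_000]).items():
        print(f"n={n:>7}: {secs:.6f} s")
```

Timing several data sizes rather than one would also show whether the GPU only pays off above some problem size.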
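Another finding: when I increase the size of the observed data in my own example, the speed drops sharply once the NUTS sampler initialization stage finishes (from about 285 it/s during ADVI to about 1.39 s/it during sampling in the output below). Is this a bug, or am I doing something wrong?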
Auto-assigning NUTS sampler…
Initializing NUTS using ADVI…
Average Loss = -20,169: 65%|██████▌ | 13089/20000 [00:49<00:24, 285.52it/s]
Convergence archived at 13100
Interrupted at 13,100 [65%]: Average Loss = -1,186.2
1%|▏ | 33/2500 [00:13<57:04, 1.39s/it]