How to run PyMC3 in a multi-node cluster? Is it possible at the moment?

jithendaraa · July 27, 2020, 2:26pm

Hey, I am trying to do something similar (see “Segmentation fault (core dumped) on running pm.sample in Ubuntu”).
What is the most efficient way to do this as of now (especially for large data)? I have access to an HPC environment, but I’m not sure I’m using it to its fullest with PyMC3. When I run a PyMC3 program on 6 nodes, only one node is at 100% CPU and the other nodes are more or less idle, as checked with the seff command. Am I doing something wrong, or is this the expected behaviour?
From this discussion, here’s my takeaway:

1. PyMC3 currently has no support for computing across multiple nodes (though there is scope to add it), so increasing the number of nodes won’t help for now.
2. Increasing the number of cores does speed up the sampling process.
3. Best solution for now: use a single node, but try to get the maximum number of cores on it.
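
To make point 3 concrete, here is a rough sketch of what I mean by using all the cores on a single node. It assumes the cluster runs Slurm (seff is a Slurm tool) and that the job was submitted with --cpus-per-task, so that SLURM_CPUS_PER_TASK is set. The helper names `available_cores` and `sample_on_one_node` and the toy model are made up for illustration. As far as I understand, the `cores` argument of pm.sample controls how many chains run in parallel; it does not make an individual chain faster.

```python
import os


def available_cores(default=1):
    """Return how many CPU cores this job may use on the current node."""
    # Assumption: Slurm sets SLURM_CPUS_PER_TASK when the job was
    # submitted with --cpus-per-task.
    slurm_value = os.environ.get("SLURM_CPUS_PER_TASK")
    if slurm_value and slurm_value.isdigit() and int(slurm_value) > 0:
        return int(slurm_value)
    # Otherwise fall back to the cores Python can see on this machine.
    return os.cpu_count() or default


def sample_on_one_node(draws=1000, tune=1000):
    """Sample a toy model with one chain per available core (illustrative only)."""
    # Imported lazily so available_cores() works even without PyMC3 installed.
    import pymc3 as pm

    n_cores = available_cores()
    with pm.Model():
        pm.Normal("mu", mu=0.0, sigma=1.0)  # placeholder model, not my real one
        # Chains run as separate processes, all on this one node.
        return pm.sample(draws=draws, tune=tune, chains=n_cores, cores=n_cores)


if __name__ == "__main__":
    print("cores to pass to pm.sample:", available_cores())
```

With this, a job asking for 1 node and N CPUs per task would run N chains side by side on that node, and extra nodes would stay idle.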

Can someone confirm whether my takeaway is right?
Is there a better “best solution for now”?
My model is a probabilistic DAG with a time-series structure, and I want to do Bayesian inference with it. Are there any alternatives to PyMC3 that might be faster for my use case? How about Stan/PyStan, is it faster?

Thanks in advance for any help!
