Hey, I am trying to do something similar – I also hit a “Segmentation fault (core dumped)” when running pm.sample on Ubuntu.
What is the most efficient way to do this right now, especially for large data? I have access to an HPC environment, but I’m not sure I’m using it to its fullest with PyMC3. When I run a PyMC3 program on 6 nodes, only one node sits at 100% CPU while the other nodes are more or less idle, as checked with the `seff` command. Am I doing something wrong, or is this the expected behaviour?
From this discussion, here’s my takeaway:
- Right now, PyMC3 has no support for computing across multiple nodes (though there is scope to add it), so increasing the number of nodes won’t help.
- Increasing the number of cores does speed up sampling, since chains are run in parallel processes.
- Best solution for now: use only a single node, but request the maximum number of cores on it.
Can someone confirm whether my takeaway is right?
Is there a better “best solution for now”?
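For context, here’s a minimal sketch of how I’m matching the sampler to the cores on a single node. I’m assuming a SLURM cluster (since `seff` is a SLURM tool); `SLURM_CPUS_PER_TASK` is only set when `--cpus-per-task` is passed, so I fall back to counting the node’s CPUs. The `pm.sample` call is shown in a comment because it needs a model to run:

```python
import multiprocessing
import os

# Cores SLURM allocated to this task; fall back to all CPUs on the
# node when SLURM_CPUS_PER_TASK is not set (e.g. outside SLURM, or
# when --cpus-per-task was not passed to sbatch/srun).
n_cores = int(os.environ.get("SLURM_CPUS_PER_TASK",
                             multiprocessing.cpu_count()))
print(f"sampling with {n_cores} cores")

# PyMC3 runs each chain in its own process, so the effective
# parallelism is min(cores, chains):
#
# with model:
#     trace = pm.sample(draws=2000, tune=1000,
#                       chains=4, cores=n_cores)
```

If this is roughly what “max out the cores on one node” means in practice, I’d still like to know whether there is anything beyond it.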
My model is a probabilistic DAG, but it also has a time-series component, and I want to do Bayesian inference on it. Are there alternatives to PyMC3 that might be faster for my use case? For example, is Stan/PyStan faster?
Thanks in advance for any help!