NUTS uses all cores

There are three reasons why NUTS and HMC might use several cores:

  • Some Theano ops use BLAS, which is usually multithreaded. There are several implementations of BLAS, and which one is used depends on which one numpy links against (you can check with np.__config__.show()). If you are using MKL, you can control the number of threads by setting the environment variable MKL_NUM_THREADS; the same variable should also work for OpenBLAS. If you are using ATLAS, you are out of luck, as its thread count is fixed at compile time. (See the first sketch after this list.)
  • Some Theano ops use OpenMP explicitly. You can switch that off entirely by setting a config option in ~/.theanorc: http://deeplearning.net/software/theano/library/config.html#config.openmp. You can control the number of threads with OMP_NUM_THREADS. (See the second sketch below.)
  • By default we use multiprocessing to run several chains in parallel. You can control the number of processes with the cores kwarg, e.g. pm.sample(cores=4). So the total number of cores you might be using at most is cores * max(MKL_NUM_THREADS, OMP_NUM_THREADS). (See the final sketch below.)
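
For the BLAS point, here is a minimal sketch of checking which implementation numpy links against and capping its threads. The cap value of 1 is just illustrative; the important detail is that the environment variables must be set before numpy (and hence the BLAS library) is first imported, so they go at the very top of the script:

```python
import os

# Cap BLAS/OpenMP threads; must happen before numpy loads the BLAS library.
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np

# Prints the BLAS implementation numpy was built against
# (look for "mkl", "openblas", or "atlas" in the output).
np.__config__.show()
```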
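For the OpenMP point, a sketch of setting the flag programmatically; using THEANO_FLAGS this way is equivalent to editing ~/.theanorc, and the specific values here are assumptions for illustration:

```python
import os

# Equivalent to putting "openmp = False" under [global] in ~/.theanorc;
# flags are read when theano is first imported.
os.environ["THEANO_FLAGS"] = "openmp=False"

# Or keep OpenMP enabled but cap it at two threads per process:
# os.environ["OMP_NUM_THREADS"] = "2"

import theano

print(theano.config.openmp)  # confirm the flag took effect
```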
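Finally, a sketch of how the cores kwarg combines with the thread caps above. The toy model is made up for illustration (assuming PyMC3); only pm.sample(cores=...) comes from the post:

```python
import pymc3 as pm

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=1.0)
    pm.Normal("obs", mu=mu, sigma=1.0, observed=[0.1, -0.3, 0.2])
    # Four chains in four worker processes; with the thread caps above,
    # total core use is at most cores * max(MKL_NUM_THREADS, OMP_NUM_THREADS).
    trace = pm.sample(cores=4, chains=4)
```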