Number of cores in variational inference interface

matticand · June 19, 2020, 3:31pm

I’m trying to speed up the fitting of my hierarchical model by increasing the number of CPU cores used. As far as I understand, PyMC3 uses half of the available cores by default, which matches what I see on several machines with different core counts.

Unlike pm.sample(), pm.fit() has no cores parameter.
Assuming that the VI interface of PyMC3 uses scipy for the optimization, I tried to raise the number by setting these environment variables to the maximum number of cores:

import os

os.environ['MKL_NUM_THREADS'] = '20'
os.environ['OPENBLAS_NUM_THREADS'] = '20'
os.environ['OMP_NUM_THREADS'] = '20'
os.environ['NUMEXPR_NUM_THREADS'] = '20'

Nonetheless, the computation still uses only half of the cores. Setting the variables to a number smaller than half of the cores does limit the number of cores actually used, so lowering the count works as expected.
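A minimal sketch of how the settings above could be applied, assuming the usual behaviour that BLAS and OpenMP runtimes read these variables when they are first loaded, so they have to be set before numpy, theano, or pymc3 are imported. The helper name `set_thread_limits` is made up for illustration, and which of the variables actually matter depends on the backend the libraries were built against.

```python
import os

# Variable names taken from the post above; which of them have an effect
# depends on the BLAS/OpenMP backend that NumPy and Theano were built against.
THREAD_VARS = (
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "OMP_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def set_thread_limits(n_threads):
    """Set the thread-count variables and return the values that were applied."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    applied = {name: str(n_threads) for name in THREAD_VARS}
    os.environ.update(applied)
    return applied


# Call this at the very top of the script, before numpy, theano, or pymc3
# are imported (hypothetical helper, not part of PyMC3).
applied = set_thread_limits(20)
print(applied["OMP_NUM_THREADS"])
```

Setting the variables after the numerical libraries are already imported may have no effect, which is one possible reason for settings appearing to be ignored.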

Is there a way to increase the number of cores used by the PyMC3 VI interface? That would be a great help.

ferrine · June 20, 2020, 6:46pm

Hi! PyMC3 does not use scipy everywhere. In VI, we use gradient descent implemented in Theano, and it is Theano that decides how many cores are used. Usually that is the same number NumPy uses. The environment variables look reasonable, but this is probably not the bottleneck of your model.

For performance gains I would suggest trying minibatches, this might help :) It will not increase the number of cores used, but iterations will be faster (you might also want to change the learning rate).
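To illustrate the idea behind minibatches, here is a standard-library sketch, not PyMC3 code: each iteration evaluates the likelihood on a random subset of the data and rescales it by the ratio of total size to batch size, so the result is an unbiased estimate of the full-data log-likelihood. In PyMC3 the corresponding pieces are `pm.Minibatch` and the `total_size` argument of an observed variable. The normal model, the synthetic data, and the function names below are assumptions for illustration only.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log-density of a normal distribution at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def minibatch_loglik(data, mu, sigma, batch_size, rng):
    """Estimate the full-data log-likelihood from a random minibatch.

    The batch sum is rescaled by len(data) / batch_size so that it is an
    unbiased estimate of the sum over all observations.
    """
    batch = rng.sample(data, batch_size)
    scale = len(data) / batch_size
    return scale * sum(normal_logpdf(x, mu, sigma) for x in batch)


rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(10_000)]  # synthetic data

full = sum(normal_logpdf(x, 1.0, 2.0) for x in data)
estimate = minibatch_loglik(data, 1.0, 2.0, batch_size=500, rng=rng)
print(f"full: {full:.1f}, minibatch estimate: {estimate:.1f}")
```

Each iteration touches 500 points instead of 10,000, which is where the speed-up comes from. The estimate is noisier than the full sum, which is why the learning rate may need adjusting.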

matticand · October 13, 2020, 9:59am

Hey, thanks for your reply and sorry for my late response. The minibatches are a very helpful feature.

Anyway, I did some more research on how to control which cores, and how many, are used. I found that it depends very much on the model. In some cases PyMC3 even uses all 20 cores with the following settings:

os.environ['MKL_NUM_THREADS'] = '20'
os.environ['OMP_NUM_THREADS'] = '20'
os.environ['openmp'] = 'True'

However, OpenMP allows more than just setting the number of cores. For anybody interested in controlling CPU and core usage, some very helpful slides can be found here (especially slides 17-22): https://www.ixpug.org/documents/1506981937ixpugfall2017_21_up2.pdf
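For reference, Theano's own `openmp` option is a Theano configuration flag, normally set through `.theanorc` or the `THEANO_FLAGS` environment variable. A minimal sketch of setting it from Python follows. The helper `add_theano_flag` is hypothetical, and whether the flag changes anything for a given model is not something this thread establishes.

```python
import os


def add_theano_flag(flag, value):
    """Append flag=value to THEANO_FLAGS, replacing an earlier setting of it."""
    existing = os.environ.get("THEANO_FLAGS", "")
    # THEANO_FLAGS is a comma-separated list of key=value pairs.
    kept = [f for f in existing.split(",") if f and not f.startswith(flag + "=")]
    kept.append(f"{flag}={value}")
    os.environ["THEANO_FLAGS"] = ",".join(kept)
    return os.environ["THEANO_FLAGS"]


# Must run before `import theano` or `import pymc3`.
os.environ["OMP_NUM_THREADS"] = "20"
flags = add_theano_flag("openmp", "True")
print(flags)
```

Existing flags are kept, so this can be combined with other settings already present in `THEANO_FLAGS`.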

luisroque · October 4, 2021, 5:36pm

Hi,

I am able to control the number of cores with:

os.environ['MKL_NUM_THREADS'] = '4'
os.environ['OMP_NUM_THREADS'] = '4'
os.environ['GOTO_NUM_THREADS'] = '4'

Nevertheless, I am seeing some strange behavior. I have a machine with 56 cores, and my model takes 52h to run with all cores in use. With 8 cores it takes ~22h, and with 4 cores ~18h. I was expecting the wall time to go down as more cores are used. Any recommendations?
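One way to investigate this is to time a shortened run at several thread counts, each in a fresh process so the environment variables are read at start-up. The sketch below uses only the standard library. The workload is a placeholder, and `fit_model.py` in the comment is a hypothetical script name. A possible explanation, not confirmed in this thread, is that threading overhead outweighs the gain when the individual linear algebra operations are small.

```python
import os
import subprocess
import sys
import time


def time_with_threads(command, thread_counts):
    """Run `command` once per thread count and return wall times in seconds."""
    timings = {}
    for n in thread_counts:
        # Each child process gets its own copy of the environment.
        env = dict(os.environ, OMP_NUM_THREADS=str(n), MKL_NUM_THREADS=str(n))
        start = time.perf_counter()
        subprocess.run(command, env=env, check=True)
        timings[n] = time.perf_counter() - start
    return timings


# Placeholder workload; replace with e.g. [sys.executable, "fit_model.py"]
# (a hypothetical script that runs a shortened version of the fit).
command = [sys.executable, "-c", "sum(i * i for i in range(10**5))"]

for n, seconds in time_with_threads(command, [1, 2, 4]).items():
    print(f"{n:>2} threads: {seconds:.2f} s")
```

If the wall time stops improving or gets worse beyond a few threads, the smallest thread count with good timing is a reasonable setting for the full run.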
