How can I parallelize the computation of a single log-likelihood in PyMC? In particular, is there a PyMC equivalent of Stan's 'reduce_sum'?

Hi PyMC developers,

First, I want to thank the PyMC developers for their excellent package. I have shifted my preference from Stan to PyMC for Bayesian computation, as it is significantly faster for my models. Most importantly, it can combine Metropolis sampling for discrete variables with NUTS for continuous variables within a single Bayesian model. This helped me implement a Dirichlet process for a complicated model without marginalization.

You are probably aware of the Stan function ‘reduce_sum’ (link), which partitions the sample, computes the log-likelihood of each part in parallel on the available cores, and then sums the parts to obtain the log-likelihood for the whole sample. I found this very helpful in one of my current projects, which involves a very large dataset: instead of running multiple chains in parallel, I would rather run a single chain and parallelize the likelihood computation.
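
To make the pattern concrete, here is a rough pure-Python sketch of the partition, map, and sum idea. It is not Stan or PyMC code: the function names (`normal_logpdf_sum`, `partition`, `reduce_sum_loglik`) and the Normal likelihood are my own illustrative assumptions, and a thread pool is used only to keep the example short.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def normal_logpdf_sum(chunk, mu, sigma):
    """Sum of Normal(mu, sigma) log-densities over one slice of the data."""
    const = -0.5 * math.log(2.0 * math.pi) - math.log(sigma)
    return sum(const - 0.5 * ((y - mu) / sigma) ** 2 for y in chunk)


def partition(data, n_parts):
    """Split data into at most n_parts contiguous slices."""
    size = max(1, math.ceil(len(data) / n_parts))
    return [data[i:i + size] for i in range(0, len(data), size)]


def reduce_sum_loglik(data, mu, sigma, n_parts=4):
    """Compute partial log-likelihoods on a worker pool, then add them up."""
    chunks = partition(data, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = pool.map(lambda c: normal_logpdf_sum(c, mu, sigma), chunks)
        return sum(partials)


# Illustrative data only: the partitioned total should match the serial one.
data = [0.1 * i for i in range(1000)]
total = reduce_sum_loglik(data, mu=50.0, sigma=10.0)
serial = normal_logpdf_sum(data, mu=50.0, sigma=10.0)
```

Because of Python's global interpreter lock, threads running pure-Python arithmetic like this will not actually speed anything up; the sketch only shows the structure. A real speedup would need worker processes or compiled code that releases the lock, which is what I am hoping PyMC can provide.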

I would be very grateful if you could point me to any existing PyMC function for this, or suggest how to implement it efficiently (for example, whether there is a way to use parallel processing within PyMC code).

Thank you.

PyMC already does some parallelization by default: BLAS operations run in parallel, and elementwise operations can be parallelized via OpenMP. See the "Multi cores support in PyTensor" page of the PyTensor dev documentation.

There are no tools that give users fine-grained control over computational parallelization, other than the flags mentioned on that page.
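
As a rough sketch of how those flags might be set from Python, assuming the `OMP_NUM_THREADS` environment variable and the `openmp` entry in `PYTENSOR_FLAGS` behave as that page describes (please check it for the exact names and defaults). The helper `configure_pytensor_threads` is a hypothetical name for illustration, not part of PyMC or PyTensor.

```python
import os


def configure_pytensor_threads(n_threads, use_openmp=True):
    """Set thread-related environment variables (hypothetical helper).

    Assumes PyTensor reads OMP_NUM_THREADS and the `openmp` flag at import
    time, so this must run before PyTensor or PyMC is imported.
    """
    os.environ["OMP_NUM_THREADS"] = str(n_threads)
    openmp_flag = f"openmp={use_openmp}"
    existing = os.environ.get("PYTENSOR_FLAGS", "")
    # Append to any flags that are already set rather than overwriting them.
    os.environ["PYTENSOR_FLAGS"] = (
        f"{existing},{openmp_flag}" if existing else openmp_flag
    )


configure_pytensor_threads(4)
# import pymc as pm  # import only after the variables above are set
```

This only controls how many threads BLAS and OpenMP-enabled elementwise operations may use; it does not split the log-likelihood across data partitions the way `reduce_sum` does.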
