Sample with multiple cores

TYLim · September 10, 2020, 2:07pm

PyMC novice here. I have a fairly simple (I think) model with 4 RVs, two predictor variables, about 200 observations:

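import pymc3 as pm

# dn and roll are the two predictor arrays; cases holds the observed counts
with pm.Model() as mod:
    alpha = pm.Exponential("alpha_0", 1.0)
    beta = pm.Lognormal("beta", 1.0, 1.0)
    nba = pm.Lognormal("nba", 1.0, 1.0)

    e_dn = dn

    # Expected count: exponential decay in dn, scaled by roll
    inf_exp = beta * pm.math.exp(-alpha * dn) * roll

    inf_obs = pm.NegativeBinomial("inf_obs", mu=inf_exp, alpha=nba, observed=cases)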

(This is the basic version of a more complex model I would like to run, so if there are potential scaling issues for a larger model I would like to know too!)

When I try to run sample with:

with mod:
    trace = pm.sample(100000, tune=50000, cores=ncores)

If ncores = 1, it runs okay.
If 1 < ncores <= 8, it takes increasingly long to initialise, and only sometimes goes on to sample.
If ncores > 8, it basically never finishes initialising and I have to restart the Jupyter kernel.

I’m running on a research computing cluster which is a Windows 10 Enterprise virtual machine with 48 cores. I need to run many (~200) separate iterations of this model, or ideally a more complex version of it, so would like to take full advantage of available computing resources to run it as fast as possible.

Based on these other answers, it seems there’s no straightforward way on Windows 10 to parallelise sample, even with lots of cores. Any suggestions or solutions welcome.

Elenchus · September 10, 2020, 2:24pm

Apparently there have been a few issues with parallel processing on Windows recently. See here, and check that you’re using the latest version.

TYLim · September 10, 2020, 3:15pm

Just updated to v3.9.3 from v3.8, but still having the same issue…

FWIW I’m not getting any runtime errors or anything. Instead I get:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...

And then after a bit longer (couple minutes usually):

Multiprocess sampling (32 chains in 32 jobs)
NUTS: [nba, beta, alpha_0]

And then… nothing happens.

TYLim · September 10, 2020, 9:09pm

After further testing, it seems I can more or less consistently get it to run with low numbers of cores (ncores <= 8, sometimes <= 16).

That said, is there a workaround that would use the available cores efficiently? For instance, could I use multiprocessing in my overall workflow to run sample concurrently on separate iterations of the model?
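One way to picture the workflow asked about here is a process pool in which every worker fits one iteration of the model with cores=1, so parallelism comes from running many independent fits rather than many chains of one fit. The following is only a minimal sketch under stated assumptions: fit_one and run_jobs are hypothetical names, the worker body is a placeholder, and the pm.sample call appears only inside the docstring to show where the sampling would go. Whether this avoids the hang on a particular Windows setup is untested.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fit_one(job_id):
    """Hypothetical worker: fit one iteration of the model.

    The body is a placeholder. In real use it would load the data for
    job_id, build the model, and sample inside this process only, e.g.:

        import pymc3 as pm
        with pm.Model():
            ...  # priors and likelihood as in the model above
            trace = pm.sample(100000, tune=50000, cores=1)

    Return something small (such as posterior summaries) rather than the
    full trace, because results are pickled back to the parent process.
    """
    return job_id, "done"


def run_jobs(job_ids, max_workers, executor_cls=ProcessPoolExecutor):
    """Run fit_one over all job_ids, at most max_workers at a time.

    executor_cls is swappable so the plumbing can be dry-run with
    ThreadPoolExecutor before real processes are launched.
    """
    with executor_cls(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as job_ids
        return dict(pool.map(fit_one, job_ids))
```

On Windows, worker processes are spawned rather than forked, so the call (for example run_jobs(range(200), max_workers=8)) should sit under an if __name__ == "__main__": guard in a script, with fit_one defined in an importable module; functions defined directly in a Jupyter cell generally cannot be sent to spawned workers.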
