I’m trying to speed up computation of a larger model, and I’m wondering whether I could run the individual model fits in parallel.
For context, the model is a hierarchical model with several levels (Warehouse > Product > SKU) that already uses broadcasting (via shape). To process it for 52 weeks and 3 planning scenarios, I run it in an outer loop. Are there any concerns with running that outer loop in parallel instead of serially?
I’m not very familiar with Theano, so I don’t know whether this could cause hard-to-detect bugs if Theano doesn’t keep the parallel runs neatly separated. I hope this makes sense.
Here’s kinda what I’m thinking:
import multiprocessing
from joblib import Parallel, parallel_backend, delayed

def run_model(data, epoch):
    # get_model and process_model are my own helpers (not shown here)
    model = get_model(data)
    trace = process_model(model, epoch)
    return trace

def main_parallel(data, epochs, n_jobs=None):
    # default to one worker per CPU core
    if n_jobs is None:
        n_jobs = multiprocessing.cpu_count()
    # process-based backend: each worker is a separate Python process
    with parallel_backend('multiprocessing'):
        traces = Parallel(n_jobs=n_jobs, verbose=10)(
            delayed(run_model)(data, epoch)
            for epoch in range(epochs)
        )
    return traces
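
To make the shape of the outer loop a bit more concrete, here is a rough sketch of how I picture flattening the 52 weeks x 3 scenarios into a single list of independent tasks. The scenario names, `build_tasks`, `fit_one`, and `run_all` are all made up for illustration, and `fit_one` is only a placeholder for the real model fit:

```python
from itertools import product

# Hypothetical scenario labels; my real ones differ.
SCENARIOS = ("base", "optimistic", "pessimistic")

def build_tasks(n_weeks=52, scenarios=SCENARIOS):
    """Return one (scenario, week) tuple per independent model fit."""
    return list(product(scenarios, range(n_weeks)))

def fit_one(task):
    """Placeholder for the real fit; it only echoes its inputs."""
    scenario, week = task
    return {"scenario": scenario, "week": week}

def run_all(tasks, map_fn=map):
    """Apply fit_one to every task using the given map-like function.

    With the built-in map this runs serially, in task order.
    """
    return list(map_fn(fit_one, tasks))

if __name__ == "__main__":
    results = run_all(build_tasks())
    print(len(results))
```

The idea is that each task shares nothing with the others, so in principle the built-in `map` could be swapped for a process pool's `map` (for example the one on `concurrent.futures.ProcessPoolExecutor`) without touching `fit_one`. Whether that holds once Theano is actually doing the work inside `fit_one` is exactly what I'm unsure about.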