I am running a PyMC3 model multiple times in a loop to estimate posterior predictive distributions; on every iteration I update the data with pm.set_data:
with model:
    pm.set_data({
        "gw_pump_semi": pump,
        "gw_pump_semi_lag": pump_lag,
        "id_wtr_yr_lag": [wtr_yr_lag] * 2,
        "id_wtr_yr": [wtr_yr] * 2,
    })
    p_post = pm.fast_sample_posterior_predictive(
        trace=gwtrace, samples=400, random_seed=800,
        var_names=["depth_like"],
    )["depth_like"]
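To illustrate the loop structure, here is a minimal stdlib-only sketch of what each iteration does; heavy_step is a placeholder for the pm.set_data + pm.fast_sample_posterior_predictive step, and the explicit del/gc.collect calls are one of the mitigations suggested for leaks of this kind:

```python
import gc

def heavy_step(i):
    # Placeholder for the real work: pm.set_data(...) followed by
    # pm.fast_sample_posterior_predictive(...); here it just builds
    # a large temporary list.
    return [i] * 10_000

def run_loop(n):
    last = None
    for i in range(n):
        out = heavy_step(i)
        last = sum(out)   # keep only the reduced result
        del out           # drop the large intermediate immediately
        gc.collect()      # force Python to reclaim garbage now
    return last

if __name__ == "__main__":
    print(run_loop(5))  # -> 40000
```

This keeps per-iteration peak memory bounded by the size of one draw rather than the whole history, but it cannot help if the leak lives in C-level caches outside Python's garbage collector.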
Before starting the parallel computation I define the pm.Model() as model and load a trace that I estimated beforehand. Each worker process then uses the model and trace to call pm.fast_sample_posterior_predictive. This works perfectly at first, but after a couple of hundred iterations it slows down and eventually stops. Suspecting a memory leak, I tried the suggestions in https://github.com/pymc-devs/pymc/issues/1959, wrapping the sampling call in a multiprocessing subprocess.
But the problem still occurs, and I really need help: this is for my PhD research and I need to run this function close to a million times.
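The worker-recycling pattern I am aiming for looks roughly like this stdlib-only sketch; run_one_draw is a placeholder for the pm.set_data + pm.fast_sample_posterior_predictive step, and the pool parameters are illustrative:

```python
import multiprocessing as mp

def run_one_draw(i):
    # Placeholder for one posterior-predictive iteration:
    # pm.set_data(...) + pm.fast_sample_posterior_predictive(...).
    # Anything this worker leaks dies with the worker process.
    return i * i

def run_all(n_tasks, workers=2, tasks_per_child=50):
    # maxtasksperchild recycles each worker after `tasks_per_child`
    # tasks, so leaked memory is returned to the OS periodically
    # instead of accumulating across hundreds of thousands of runs.
    with mp.Pool(processes=workers, maxtasksperchild=tasks_per_child) as pool:
        return pool.map(run_one_draw, range(n_tasks))

if __name__ == "__main__":
    results = run_all(200)
    print(results[:3])  # -> [0, 1, 4]
```

The cost is re-importing the model in each fresh worker, so tasks_per_child trades startup overhead against how much leaked memory can pile up.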
pymc3 = 3.11.2
theano-pymc = 1.1.2
python = 3.9.7
Installed with: conda install -c conda-forge pymc3 theano-pymc mkl mkl-service
The cluster runs Linux.