I’m fairly new to PyMC, and so far I’ve been able to do some Bayesian fitting with it.
In my case I have a simple model:
```python
with basic_model:
    # Priors for unknown model parameters
    a = pm.Uniform("a", 0, 10)
    alpha = pm.Uniform("alpha", 0, 5)
    b = pm.Uniform("b", 0, 3000)

    # Model of the spectral density
    Sj = a * frequency ** (-alpha) + b

    # Likelihood: exponentially distributed periodogram ordinates
    # (the observed RV already defines the joint likelihood, so no
    # explicit product over data points is needed)
    likelihood = pm.Exponential("likelihood", scale=Sj,
                                observed=periodogram[:, i, j])

    trace = pm.sample(4000, tune=400, cores=1, chains=4,
                      progressbar=False, nuts_sampler="blackjax")
```
```python
frequency = np.linspace(1.2e-5, 1/120, 2000)
```
`periodogram` is an array with shape `(2000, 2048, 2048)`.
Basically, I’m fitting a simple power-law model (`Sj`) to each pixel of my 2048 × 2048 periodogram array (in practice I only do it for about 2.5M points, not all of them).
With nutpie (CPU) or the GPU backends, each individual fit samples quite fast; the problem is the sheer number of fits, so I think I need some parallelization across them.
On CPU I tried the multiprocessing library and it works fine, but I don’t have access to enough CPU cores to really speed things up (I’d like to get these huge arrays done in less than half a day if possible; right now, even with CPU multiprocessing, it takes about 3 days). On the other hand, I have access to two NVIDIA RTX A6000 GPUs. Is there a native way to parallelize the calculations on them with PyMC?
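For context, my CPU parallelization looks roughly like this. To keep the snippet self-contained I’ve replaced the PyMC model with a placeholder maximum-likelihood (Whittle) point fit; `fit_pixel`, `fit_all`, the starting values, and the bounds are my simplifications, not the real run:

```python
import numpy as np
from multiprocessing import Pool
from scipy.optimize import minimize

def whittle_nll(params, freqs, power):
    # Negative Whittle log-likelihood for S(f) = a * f**(-alpha) + b,
    # i.e. exponentially distributed periodogram ordinates.
    a, alpha, b = params
    S = a * freqs ** (-alpha) + b
    return np.sum(np.log(S) + power / S)

def fit_pixel(args):
    # Placeholder point fit for one pixel; in the real run this is
    # the PyMC model from the post instead.
    freqs, power = args
    res = minimize(
        whittle_nll,
        x0=np.array([1.0, 1.0, np.mean(power)]),
        args=(freqs, power),
        method="L-BFGS-B",
        bounds=[(1e-8, 10), (0, 5), (1e-8, 3000)],
    )
    return res.x

def fit_all(periodogram, freqs, pixels, processes=4):
    # pixels: list of (i, j) indices to fit, one task per pixel
    tasks = [(freqs, periodogram[:, i, j]) for i, j in pixels]
    with Pool(processes) as pool:
        return pool.map(fit_pixel, tasks)
```

This scales linearly with the number of cores, which is exactly why it isn’t enough for ~2.5M pixels on my machine.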
The question comes from seeing many people use GPUs for huge datasets; in my case I instead have a lot of small, independent datasets that each need their own model fit. So I wonder whether this kind of GPU parallelization is doable.
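To make concrete what I mean by “a lot of small independent fits”: every pixel has its own exponential (Whittle) likelihood over the shared frequency grid, so in JAX terms I imagine batching the log-likelihood over the pixel axis with `vmap`. This is only a sketch of the idea (the function name and shapes are my own, and I don’t have the sampling side working end to end):

```python
import jax
import jax.numpy as jnp

def whittle_loglik(params, freqs, power):
    # Exponential (Whittle) log-likelihood of one pixel's periodogram
    # under S(f) = a * f**(-alpha) + b
    a, alpha, b = params
    S = a * freqs ** (-alpha) + b
    return -jnp.sum(jnp.log(S) + power / S)

# Batch over pixels: one (a, alpha, b) triple per pixel, one
# periodogram column per pixel, a single shared frequency grid.
batched_loglik = jax.vmap(whittle_loglik, in_axes=(0, None, 0))
```

If something like this can drive a batched NUTS run on the GPUs, that would be exactly what I’m after.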
I’m happy to provide any additional information and to follow any advice you may have.