I have done some more tests (mostly in 5.9.1, one in 5.9.0), varying some parameters in the code; a rough sketch of the model is included after the log below.
With advi (I did not run these to completion):
1- nclusters=2 and nsamples=100: the advi init progress bar gives an initial estimate of ~40 min.
2- nclusters=2 and nsamples=1000: the initial estimate is about 6 hours.
3- nclusters=5 and nsamples=1000: advi starts with an initial estimate of 15 hours.
So at least the increase is consistent!
Without advi:
1- nclusters=2 and nsamples=100: sampling starts with an initial estimate of 20 minutes; however, after about 1 minute it speeds up quite considerably and finishes in around 6 minutes.
2- nclusters=5 and nsamples=100: the estimate gets progressively worse, up to about 2 hours, then sampling speeds up and finishes in around 33 minutes.
3- nclusters=5 and nsamples=1000: this one starts with an initial guess of 10 hours, gets progressively worse, and then gets stuck at some point. This is what I had before I stopped it:
pymc version: 5.9.1, random_seed: 1111, nclusters 5, n_samples: 1000
using advi: False
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [w, x_coord, y_coord, sigma]
 ----------------------------------------------------------| 2.75% [220/8000 42:21<24:57:59 Sampling 4 chains, 0 divergences]
^CTraceback (most recent call last):
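For reference, the model has roughly this shape. This is only a minimal sketch pieced together from the variable names in the log above; the priors and the data generation here are placeholders, not the exact script:

import numpy as np
import pymc as pm

nclusters = 5
nsamples = 1000
use_advi = False

# Placeholder data; the real data generation is not shown here.
rng = np.random.default_rng(1111)
data = rng.normal(size=(nsamples, 2))

with pm.Model():
    # Mixture weights and per-cluster 2D locations (names from the log).
    w = pm.Dirichlet("w", a=np.ones(nclusters))
    x_coord = pm.Normal("x_coord", mu=0.0, sigma=5.0, shape=nclusters)
    y_coord = pm.Normal("y_coord", mu=0.0, sigma=5.0, shape=nclusters)
    sigma = pm.HalfNormal("sigma", sigma=1.0)

    # Independent mixtures over the x and y coordinates.
    pm.Mixture("obs_x", w=w,
               comp_dists=pm.Normal.dist(mu=x_coord, sigma=sigma),
               observed=data[:, 0])
    pm.Mixture("obs_y", w=w,
               comp_dists=pm.Normal.dist(mu=y_coord, sigma=sigma),
               observed=data[:, 1])

    # init="advi" is the "with advi" case above; the default is
    # jitter+adapt_diag, as shown in the log.
    idata = pm.sample(
        init="advi" if use_advi else "jitter+adapt_diag",
        chains=4,
        random_seed=1111,
    )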
As a test, I again ran the third option in pymc 5.9.0. It starts off with an estimate of one and a half hours, though it picks up speed after some time and finishes in around 22 minutes. As before, the only difference between these environments is the pymc version. I then added the following at the top of the script (as suggested on the GitHub issue page):
import os

# Limit the BLAS thread pools to a single thread; these must be set
# before numpy/pytensor is first imported to take effect.
os.environ['OPENBLAS_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'
It still gets stuck. I also tried exporting these variables from bash just in case, and setting them as conda environment variables too. It doesn't seem like it will ever finish:
|█-------------------------------------------------------------------------------------------| 1.26% [101/8000 08:40<11:18:24 Sampling 4 chains, 0 divergences]
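For completeness, the same limit can also be applied at runtime with threadpoolctl. This is just a sketch, assuming the threadpoolctl package is installed; it is not something suggested in the issue thread:

from threadpoolctl import threadpool_limits

# Cap the BLAS thread pools at runtime instead of via env vars.
# Caveat: this only affects the current process; the chain workers
# spawned by multiprocess sampling load their own BLAS, so the
# env-var approach above is still needed to cover them.
with threadpool_limits(limits=1, user_api="blas"):
    idata = pm.sample(chains=4, random_seed=1111)  # as in the sketch above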
So at least on my end it does not seem to be a threading issue. Let me know if there is anything else I can try.
Can anyone replicate this speed issue on Ubuntu 20.04? Meanwhile, I will stick to 5.9.0.