Hi there! I am using pymc3 to fit data corresponding to the Fourier transform of a signal. I have a codebase that is rather complicated, but I can reproduce the behavior with a simple snippet, shown below, which just generates a brief signal and fits the real and imaginary parts of its Fourier transform.
The issue is that on my previous machine, running Ubuntu 18.04 with an Intel processor, pm.sample(chains=1) would by default use 6 cores (the machine has 12 counting hyperthreading). This was with no changes to any environment variables and with no .theanorc. On a new machine with a fresh install of Ubuntu 20.04 and an AMD processor with 8 cores (16 with hyperthreading), it never uses more than 1 core. Even setting OMP_NUM_THREADS, MKL_NUM_THREADS, GOTO_NUM_THREADS, and OPENBLAS_NUM_THREADS to 16 (a scattershot approach, admittedly) does nothing to change this behavior. However, running check_blas.py in the theano/misc/ directory uses 8 cores by default, and fewer when MKL_NUM_THREADS is set to a number below 8. More to the point, the output of numpy.__config__.show() indicates that MKL is being used, which I know is not great on AMD, but it should still use more than one core for large array operations (and indeed does so when check_blas.py is run).
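For anyone trying to reproduce this, here is a minimal sketch of one way to set those thread-count variables from inside a script rather than in the shell. It assumes the BLAS/OpenMP libraries only read them when they are first loaded, so it has to run before numpy, theano, or pymc3 are imported. The helper name set_blas_threads is made up for illustration and is not part of any library.

```python
import os

# Environment variables commonly read by threaded BLAS/OpenMP backends.
THREAD_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "GOTO_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
)


def set_blas_threads(n_threads):
    """Hypothetical helper: set every thread-count variable to n_threads."""
    for name in THREAD_VARS:
        os.environ[name] = str(n_threads)
    # Return what is now in the environment so it can be logged or checked.
    return {name: os.environ[name] for name in THREAD_VARS}


# Assumption: this runs before numpy/theano/pymc3 are imported for the first time.
settings = set_blas_threads(16)
print(settings)
```

Printing the returned dictionary at the top of a run at least confirms that the process sees the values it is supposed to see.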
Could MKL be causing issues in the context of pymc3 in some way that does not come up in the theano check_blas.py? What setup and installation procedure would you prescribe for a system using AMD? I have not built numpy or theano from source before, so I would need some guidance on how to build them against a non-default BLAS library.
Please provide a minimal, self-contained, and reproducible example.
import pymc3 as pm
import numpy as np
length = 100
x_pos = np.linspace(0,99,length)
signal = np.random.rand(length)
u_pos = np.linspace(0, 10, 10000)
fourier_matrix_real = np.exp(np.cos(2.0*np.pi*np.outer(u_pos, x_pos)))
fourier_matrix_imag = np.exp(np.sin(2.0*np.pi*np.outer(u_pos, x_pos)))
data_real = np.dot(fourier_matrix_real, signal)
data_imag = np.dot(fourier_matrix_imag, signal)
sd = 0.1*np.sqrt(data_real**2+data_imag**2)
with pm.Model() as model:
    # Uniform prior on each of the signal samples
    signal_prior = pm.Uniform('signal', lower=0, upper=1, shape=length)
    # Model predictions for the real and imaginary parts of the transform
    prediction_real = pm.math.dot(fourier_matrix_real, signal_prior)
    prediction_imag = pm.math.dot(fourier_matrix_imag, signal_prior)
    # Gaussian likelihoods; the imaginary part is observed against data_imag
    likelihood_real = pm.Normal('real_FT', mu=prediction_real, sd=sd, observed=data_real)
    likelihood_imag = pm.Normal('imag_FT', mu=prediction_imag, sd=sd, observed=data_imag)
    posterior = pm.sample(chains=1)
Please provide the full traceback.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [signal]
#progress bar, no errors
Please provide any additional information below.
Here is the output of numpy.__config__.show():
blas_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/daniel/anaconda3/envs/py38/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/daniel/anaconda3/envs/py38/include']
blas_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/daniel/anaconda3/envs/py38/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/daniel/anaconda3/envs/py38/include']
lapack_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/daniel/anaconda3/envs/py38/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/daniel/anaconda3/envs/py38/include']
lapack_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/daniel/anaconda3/envs/py38/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/daniel/anaconda3/envs/py38/include']
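Independently of check_blas.py, a rough way to see whether the BLAS that numpy links against runs a large product on several cores is to compare CPU time with wall-clock time. The sketch below is only an illustration under stated assumptions: the function name cpu_to_wall_ratio, the matrix size, and the repeat count are arbitrary choices of mine, and it times a square matrix-matrix product, which is not the same operation as the matrix-vector products in the model above.

```python
import time

import numpy as np


def cpu_to_wall_ratio(n=1000, repeats=3):
    """Hypothetical helper: CPU time divided by wall time for n x n products."""
    rng = np.random.default_rng(0)
    a = rng.random((n, n))
    b = rng.random((n, n))
    a @ b  # warm-up so library initialisation is not timed

    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time summed over all threads of the process
    for _ in range(repeats):
        a @ b
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall


ratio = cpu_to_wall_ratio()
print(f"CPU time / wall time: {ratio:.2f}")
```

A ratio close to 1 suggests the product ran on a single core, while a value well above 1 suggests several threads were busy at once.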
Versions and main components
- PyMC3 Version: 3.9.3
- Theano Version: 1.0.5
- Python Version: 3.8.5
- Operating system: Ubuntu 20.04
- How did you install PyMC3: conda, no flags besides -c conda-forge
Thank you for your attention, and for developing pymc3!