NUTS sampler stuck under Windows with njobs>1

Hello,

When I tried to run a PyMC3 example, my sampler got stuck even with a very simple model. Here is the example I was running:

import pymc3 as pm
model = pm.Model()
with model:
    mu1 = pm.Normal("mu1", mu=0, sd=1, shape=10)
with model:
    step = pm.NUTS()
    trace = pm.sample(2000, tune=1000, init=None, step=step, njobs=2)

Here is the output. I can tell the program was running since the CPU was being used, but it stayed stuck here forever. Any idea what is going on?
image

I was running PyMC3 (3.3) with Anaconda 2 on a Windows 10 machine.

You should use the default sampling initialization, which optimizes the step size, mass matrix, etc. of the NUTS sampler:

with model:
    trace = pm.sample(2000, tune=1000, njobs=2)

What you are doing (creating the step yourself and passing it to pm.sample) turns off the initialization, so NUTS runs with the default parameters without tuning, which in most cases results in suboptimal performance.

Thanks for the reply. I have tried what you suggested, but it still gets stuck with similar output:

image

Hmmm that might be a joblib problem then, could you please try setting njobs=1?

Thanks for the reply.
It starts to sample now! How should I solve this joblib problem?

Unfortunately it is somewhat of a mystery to me as well :sweat_smile:
For me (on macOS) joblib fails whenever the model is too big, but it always gives a Python crash warning. In your case, do you see any warning or error at all? Something like a pickle error?

Thank you for your reply. I executed the original file (njobs=2) from a terminal instead of from Spyder, and some error messages came up:

(base) C:\Users\linyu\Desktop>python untitled1.py
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [mu1]
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

You can find the C code in this temporary file: c:\users\linyu\appdata\local\temp\theano_compilation_error_ysdudd
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Anaconda2\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "E:\Anaconda2\lib\multiprocessing\forking.py", line 510, in prepare
    '__parents_main__', file, path_name, etc
  File "C:\Users\linyu\Desktop\untitled1.py", line 12, in <module>
    mu1 = pm.Normal("mu1", mu=0, sd=1, shape=10)
  File "E:\Anaconda2\lib\site-packages\pymc3\distributions\distribution.py", line 37, in __new__
    return model.Var(name, dist, data, total_size)
  File "E:\Anaconda2\lib\site-packages\pymc3\model.py", line 752, in Var
    total_size=total_size, model=self)
  File "E:\Anaconda2\lib\site-packages\pymc3\model.py", line 1137, in __init__
    self.scaling = _get_scaling(total_size, self.shape, self.ndim)
  File "E:\Anaconda2\lib\site-packages\theano\tensor\var.py", line 275, in <lambda>
    shape = property(lambda self: theano.tensor.basic.shape(self))
  File "E:\Anaconda2\lib\site-packages\theano\gof\op.py", line 670, in __call__
    no_recycling=[])
  File "E:\Anaconda2\lib\site-packages\theano\gof\op.py", line 955, in make_thunk
    no_recycling)
  File "E:\Anaconda2\lib\site-packages\theano\gof\op.py", line 858, in make_c_thunk
    output_storage=node_output_storage)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1217, in make_thunk
    keep_lock=keep_lock)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1157, in __compile__
    keep_lock=keep_lock)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1620, in cthunk_factory
    key=key, lnk=self, keep_lock=keep_lock)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cmodule.py", line 1174, in module_from_key
    module = lnk.compile_cmodule(location)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1523, in compile_cmodule
    preargs=preargs)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cmodule.py", line 2362, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
Exception: ('Compilation failed (return status=3): ', '[Shape(mu1)]')
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFA094494C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFA30D37EDD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFA31DA1FE4  Unknown               Unknown  Unknown
ntdll.dll          00007FFA33B3EFB1  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFA094494C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFA30D37EDD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFA31DA1FE4  Unknown               Unknown  Unknown
ntdll.dll          00007FFA33B3EFB1  Unknown               Unknown  Unknown

(base) C:\Users\linyu\Desktop>E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [mu1]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Anaconda2\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "E:\Anaconda2\lib\multiprocessing\forking.py", line 510, in prepare
    '__parents_main__', file, path_name, etc
  File "C:\Users\linyu\Desktop\untitled1.py", line 16, in <module>
    trace = pm.sample(2000, tune=1000, njobs=2)
  File "E:\Anaconda2\lib\site-packages\pymc3\sampling.py", line 420, in sample
    trace = _mp_sample(**sample_args)
  File "E:\Anaconda2\lib\site-packages\pymc3\sampling.py", line 950, in _mp_sample
    traces = Parallel(n_jobs=njobs)(jobs)
  File "E:\Anaconda2\lib\site-packages\joblib\parallel.py", line 749, in __call__
    n_jobs = self._initialize_backend()
  File "E:\Anaconda2\lib\site-packages\joblib\parallel.py", line 547, in _initialize_backend
    **self._backend_args)
  File "E:\Anaconda2\lib\site-packages\joblib\_parallel_backends.py", line 305, in configure
    '[joblib] Attempting to do parallel computing '
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information

Did you try to follow this?

ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information

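Your file should look something like this:

import pymc3 as pm

def run():
    # Build the model and sample inside a function so that nothing
    # heavy runs when a worker process re-imports this file.
    model = pm.Model()
    with model:
        mu1 = pm.Normal("mu1", mu=0, sd=1, shape=10)
    with model:
        step = pm.NUTS()
        trace = pm.sample(2000, tune=1000, init=None, step=step, njobs=2)
    return trace

if __name__ == '__main__':
    run()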
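
The joblib message above comes down to how worker processes start on a system without forking: each worker re-imports the script, and the earlier traceback shows the child loading it under the name `__parents_main__` rather than `__main__`. The sketch below imitates that re-import with `runpy` instead of real worker processes (an assumption made to keep it self-contained). The script body, the `events` list and the helper `run_as` are hypothetical placeholders, not PyMC3 or joblib code.

```python
import os
import runpy
import tempfile

# Placeholder script: the first append stands in for building the model,
# the guarded append stands in for pm.sample(..., njobs=2).
SCRIPT = '''
events = []
events.append("top-level code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(run_name):
    """Execute SCRIPT as a module called run_name and return its events."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "untitled1.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        # run_path returns the globals of the executed module
        return runpy.run_path(path, run_name=run_name)["events"]


as_parent = run_as("__main__")          # like `python untitled1.py`
as_child = run_as("__parents_main__")   # like the re-import in a worker

print(as_parent)
print(as_child)
```

Only the unguarded line runs in the simulated worker, which is why anything that starts new processes (here, the sampling call) has to sit under the guard, while unguarded model code is executed again in every worker.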

Here is what I'm trying, since it leaves no code running outside the if statement:

import pymc3 as pm
    
if __name__ == '__main__':
    model = pm.Model()
    with model:
        mu1 = pm.Normal("mu1", mu=0, sd=1, shape=10)
    with model:
        trace = pm.sample(2000, tune=1000, njobs=2)

Now it gets stuck somewhere else…

(base) C:\Users\linyu\Desktop>python untitled1.py
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [mu1]
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Process PoolWorker-1:
Traceback (most recent call last):
  File "E:\Anaconda2\lib\multiprocessing\process.py", line 267, in _bootstrap
    self.run()
  File "E:\Anaconda2\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "E:\Anaconda2\lib\multiprocessing\pool.py", line 102, in worker
    task = get()
  File "E:\Anaconda2\lib\site-packages\joblib\pool.py", line 362, in get
    return recv()
  File "E:\Anaconda2\lib\site-packages\pymc3\step_methods\arraystep.py", line 39, in __new__
    model = modelcontext(kwargs.get('model'))
  File "E:\Anaconda2\lib\site-packages\pymc3\model.py", line 147, in modelcontext
    return Model.get_context()
  File "E:\Anaconda2\lib\site-packages\pymc3\model.py", line 139, in get_context
    raise TypeError("No context on context stack")
TypeError: No context on context stack
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Process PoolWorker-2:
Traceback (most recent call last):
  File "E:\Anaconda2\lib\multiprocessing\process.py", line 267, in _bootstrap
    self.run()
  File "E:\Anaconda2\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "E:\Anaconda2\lib\multiprocessing\pool.py", line 102, in worker
    task = get()
  File "E:\Anaconda2\lib\site-packages\joblib\pool.py", line 362, in get
    return recv()
  File "E:\Anaconda2\lib\site-packages\pymc3\step_methods\arraystep.py", line 39, in __new__
    model = modelcontext(kwargs.get('model'))
  File "E:\Anaconda2\lib\site-packages\pymc3\model.py", line 147, in modelcontext
    return Model.get_context()
  File "E:\Anaconda2\lib\site-packages\pymc3\model.py", line 139, in get_context
    raise TypeError("No context on context stack")
TypeError: No context on context stack
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
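
One possible reading of the traceback above: it ends in `modelcontext`, called from the step method's `__new__` while the worker was receiving its task (`recv()`), so the step object seems to have been rebuilt in a process where no `with model:` block was active. The sketch below reproduces that failure mode with toy classes. `ToyModel`, `ToyStep` and `rebuild` are hypothetical and not PyMC3's implementation, and `rebuild` only stands in for unpickling in a worker, assuming the object is recreated by calling `__new__` without arguments.

```python
class ToyModel:
    """Toy model that registers itself on a class-level context stack."""
    _stack = []

    def __enter__(self):
        ToyModel._stack.append(self)
        return self

    def __exit__(self, *exc):
        ToyModel._stack.pop()

    @classmethod
    def get_context(cls):
        if not cls._stack:
            raise TypeError("No context on context stack")
        return cls._stack[-1]


class ToyStep:
    """Toy step method whose __new__ needs an active model context."""

    def __new__(cls, *args, **kwargs):
        model = kwargs.get("model") or ToyModel.get_context()
        obj = super().__new__(cls)
        obj.model = model
        return obj


def rebuild(obj):
    """Recreate obj as a protocol-2 unpickler would: call func(*args)."""
    func, args = obj.__reduce_ex__(2)[:2]
    return func(*args)


with ToyModel() as model:
    step = ToyStep()                 # fine: a model is on the stack
    rebuilt_inside = rebuild(step)   # also fine while the block is active

try:
    rebuild(step)                    # outside the block the stack is empty
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```

If this reading is right, the guard itself is doing its job here and the remaining failure happens when the step object crosses the process boundary.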

I notice that there is a similar question: https://github.com/pymc-devs/pymc3/issues/2383

So I tried with the code in that issue:

import numpy as np

# Initialize random number generator
np.random.seed(123)

# True parameter values
alpha, sigma = 1, 1
beta = [1, 2.5]

# Size of dataset
size = 100

# Predictor variable
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2

# Simulate outcome variable
Y = alpha + beta[0] * X1 + beta[1] * X2 + np.random.randn(size) * sigma

import pymc3 as pm

basic_model = pm.Model()

with basic_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    sigma = pm.HalfNormal('sigma', sd=1)

    # Expected value of outcome
    mu = alpha + beta[0] * X1 + beta[1] * X2

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

# draw 500 posterior samples
if __name__ == '__main__':
    with basic_model:
        trace = pm.sample(njobs=2)

The error message is as follows:

(base) C:\Users\linyu\Desktop>python untitled1.py
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [sigma_log__, beta, alpha]
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

You can find the C code in this temporary file: c:\users\linyu\appdata\local\temp\theano_compilation_error_g9x813
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Anaconda2\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "E:\Anaconda2\lib\multiprocessing\forking.py", line 510, in prepare
    '__parents_main__', file, path_name, etc
  File "C:\Users\linyu\Desktop\untitled1.py", line 33, in <module>
    alpha = pm.Normal('alpha', mu=0, sd=10)
  File "build\bdist.win-amd64\egg\pymc3\distributions\distribution.py", line 37, in __new__
  File "build\bdist.win-amd64\egg\pymc3\model.py", line 752, in Var
  File "build\bdist.win-amd64\egg\pymc3\model.py", line 1137, in __init__
  File "E:\Anaconda2\lib\site-packages\theano\tensor\var.py", line 275, in <lambda>
    shape = property(lambda self: theano.tensor.basic.shape(self))
  File "E:\Anaconda2\lib\site-packages\theano\gof\op.py", line 670, in __call__
    no_recycling=[])
  File "E:\Anaconda2\lib\site-packages\theano\gof\op.py", line 955, in make_thunk
    no_recycling)
  File "E:\Anaconda2\lib\site-packages\theano\gof\op.py", line 858, in make_c_thunk
    output_storage=node_output_storage)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1217, in make_thunk
    keep_lock=keep_lock)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1157, in __compile__
    keep_lock=keep_lock)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1620, in cthunk_factory
    key=key, lnk=self, keep_lock=keep_lock)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cmodule.py", line 1174, in module_from_key
    module = lnk.compile_cmodule(location)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cc.py", line 1523, in compile_cmodule
    preargs=preargs)
  File "E:\Anaconda2\lib\site-packages\theano\gof\cmodule.py", line 2362, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
Exception: ('Compilation failed (return status=3): ', '[Shape(alpha)]')
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFA094494C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFA30D37EDD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFA31DA1FE4  Unknown               Unknown  Unknown
ntdll.dll          00007FFA33B3EFB1  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFA094494C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFA30D37EDD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFA31DA1FE4  Unknown               Unknown  Unknown
ntdll.dll          00007FFA33B3EFB1  Unknown               Unknown  Unknown

(base) C:\Users\linyu\Desktop>E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

If I change to njobs=1, the code works just fine.

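If you are running it from the terminal ($ python code.py), then you should wrap the model and sampling code in a function (here run()) and put

if __name__ == '__main__':
    run()

at the end of the code. Try saving the code below into a .py file:

import pymc3 as pm

def run():
    # Build the model and sample inside a function so that nothing
    # heavy runs when a worker process re-imports this file.
    model = pm.Model()
    with model:
        mu1 = pm.Normal("mu1", mu=0, sd=1, shape=10)
    with model:
        step = pm.NUTS()
        trace = pm.sample(2000, tune=1000, init=None, step=step, njobs=2)
    return trace

if __name__ == '__main__':
    run()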

@junpenglao, your example fails for me with the same errors as previously reported.
I'm on Windows 10 with PyMC3 3.3, in a conda environment created with

conda create -n test python=3.6 pymc3

I can post the versions of all the packages this installs, if that will help.

I have the same problem. At first I did not use a .theanorc config. When I run some examples, Jupyter or Python crashes and reports "'The following error happened while compiling the node', InplaceDimShuffle{0,x}(Sum{axis=[1], acc_dtype=float64}.0), '\n', 'Compilation failed (return status=3)". In some cases it works well.
Then I added a .theanorc config. With both "device=cuda" and "device=cpu" I must set "cores=1"; otherwise I always get "[Errno 32] Broken pipe". With "cores=1", sampling is slow…