pm.sample gets stuck after init with cores > 1

@AlexAndorra @lucianopaz

So I’ve tried reducing the number of predictors from 90 to 9, and multi-core sampling (cores=2) started to work. Strange.

If I bump the predictor count up to 73 (still cores=2), the sampler gets stuck again. Here’s the stack trace I get when I Ctrl-C the Python process:

  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/sampling.py", line 1059, in _mp_sample
    for draw in sampler:
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 394, in __iter__
    draw = ProcessAdapter.recv_draw(self._active)
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 284, in recv_draw
    ready = multiprocessing.connection.wait(pipes)
  File ".conda/envs/DEV/lib/python3.7/multiprocessing/connection.py", line 920, in wait
    ready = selector.select(timeout)
  File ".conda/envs/DEV/lib/python3.7/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/sampling.py", line 1068, in _mp_sample
    trace._add_warnings(draw.warnings)
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 425, in __exit__
    ProcessAdapter.terminate_all(self._samplers)
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 319, in terminate_all
    process.join(timeout)
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 274, in join
    self._process.join(timeout)
  File ".conda/envs/DEV/lib/python3.7/multiprocessing/process.py", line 140, in join
    res = self._popen.wait(timeout)
  File ".conda/envs/DEV/lib/python3.7/multiprocessing/popen_fork.py", line 45, in wait
    if not wait([self.sentinel], timeout):
  File ".conda/envs/DEV/lib/python3.7/multiprocessing/connection.py", line 920, in wait
    ready = selector.select(timeout)
  File ".conda/envs/DEV/lib/python3.7/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 85, in <module>
    trace = pm.sample(draws=3000, chains=4, cores=2, tune=2000)
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/sampling.py", line 469, in sample
    trace = _mp_sample(**sample_args)
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/sampling.py", line 1080, in _mp_sample
    traces, length = _choose_chains(traces, tune)
  File ".conda/envs/DEV/lib/python3.7/site-packages/pymc3/sampling.py", line 1096, in _choose_chains
    raise ValueError("Not enough samples to build a trace.")
ValueError: Not enough samples to build a trace.
Sampling 4 chains, 0 divergences:   0%|                                                                                                              | 0/20000 [01:29<?, ?draws/s]

Is there any useful info in here? It seems like the parallel sampler is waiting on something, but what?
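
For what it’s worth, my reading of the first traceback is that the parent process is sitting in `multiprocessing.connection.wait(pipes)`, i.e. it is waiting for one of the worker processes to send something back over its pipe. Below is a minimal sketch of that waiting pattern using only the standard library. It is not PyMC3’s actual code: `poll_for_draw`, `parent_end` and `worker_end` are hypothetical names I made up, and no real worker process is started. It only shows that `wait` returns nothing until the other end of the pipe sends data.

```python
import multiprocessing as mp
from multiprocessing.connection import wait


def poll_for_draw(pipes, timeout):
    """Return the connections that have data ready, or [] if none arrive in time.

    Hypothetical helper: a thin wrapper around multiprocessing.connection.wait.
    """
    return wait(pipes, timeout)


# One one-way pipe standing in for a single (hypothetical) worker.
# parent_end can only receive, worker_end can only send.
parent_end, worker_end = mp.Pipe(duplex=False)

# Nothing has been sent yet, so the parent sees no ready pipes.
# With timeout=None this call would block indefinitely instead.
ready_before = poll_for_draw([parent_end], timeout=0.1)

# As soon as the worker side sends something, the parent end becomes ready.
worker_end.send("draw")
ready_after = poll_for_draw([parent_end], timeout=0.1)
received = ready_after[0].recv()

print(ready_before, received)
```

If that reading is right, the traceback above only shows where the parent is waiting. What the workers themselves are doing would still have to be inspected separately.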