Aha, let’s dig this one up from the past… because right now I’m getting the exact same issue. I debugged all the way down to the same region of code (`multiprocessing/connection.py:399`) and still have no idea how this could be dataset-size dependent.
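For context: the EOFError at that line just means the parent's read on the message pipe returned zero bytes, i.e. the chain worker process died (or closed its pipe) without sending anything. A minimal sketch of the same failure mode, entirely separate from pymc:

```python
import multiprocessing as mp
import os

def worker(conn):
    # Simulate a chain worker dying abruptly (native crash, OOM kill, ...)
    # before it writes its next message to the pipe.
    os._exit(1)

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    parent_conn, child_conn = ctx.Pipe()
    p = ctx.Process(target=worker, args=(child_conn,))
    p.start()
    child_conn.close()  # drop the parent's copy of the child end
    try:
        parent_conn.recv()  # raises EOFError at connection.py:399, as above
    except EOFError:
        print("worker exited without sending a message")
    p.join()
```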
My pymc model is very different from the one above, and I expect the versions of everything are different too. I first saw this in an env based on macOS 14.7, Python 3.11 & pymc v5.16, then figured this was a good time to upgrade everything anyway, and installed macOS 15.2 and a new env with Python 3.12, pymc 5.20, etc. All bang up to date.
My only guess is that it might be something to do with available RAM, but iStats / htop show that there’s RAM available at the time, so maybe it’s an addressing problem or an object-size limit somewhere inside multiprocessing? Something maybe to do with [multiprocessing — Process-based parallelism — Python 3.13.1 documentation](https://docs.python.org/3/library/multiprocessing.html)?
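If it were a hard message-size limit I’d expect a different failure: the 4-byte header read at `connection.py:430` (see the stacktrace at the end of this post) has a `size == -1` escape hatch to a longer length field for payloads over 2 GiB, so modern CPython shouldn’t choke on size alone. Still, it’s cheap to measure what’s actually being pickled across the pipe; a tiny hypothetical helper, not anything from pymc:

```python
import pickle

def pickled_size_mib(obj) -> float:
    # multiprocessing Connections ship pickled bytes, so this approximates
    # what a single message carrying `obj` would cost on the wire.
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)) / 2**20

# e.g. pickled_size_mib(my_dataset) -- anything approaching 2048 MiB
# would be worth investigating.
```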
A lousy “fix” is to avoid using multiprocessing by setting cores=1.
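For concreteness, a sketch (`my_model` is a placeholder for whatever model triggers the crash):

```python
import pymc as pm

with my_model:
    # cores=1 sidesteps the ParallelSampler / pipe machinery entirely;
    # the chains just run sequentially in the main process.
    idata = pm.sample(cores=1)
```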
FWIW, this is the general summary of my current machine / stack:
macOS 15.2 (Sequoia), MacBook Air M2, 24GB RAM
$> xcode-select -v
xcode-select version 2409.
Issue is seen when using any typical IDE:
- VSCode v1.96.4 (Jupyter extension ms-toolsai.jupyter v2024.10.0)
- straight JupyterLab launched from the terminal
- straight Jupyter Notebook launched from the terminal
Environment is fairly normal (I hope), installed via conda-forge:
Python implementation: CPython
Python version : 3.12.8
IPython version : 8.31.0
ipykernel : 6.29.5
pymc : 5.20.0
pytensor: 2.26.4
Compiler : Clang 18.1.8
OS : Darwin
Release : 24.2.0
Machine : arm64
Processor : arm
CPU cores : 8
Architecture: 64bit
sys : 3.12.8 | packaged by conda-forge | (main, Dec 5 2024, 14:19:53) [Clang 18.1.8 ]
I think (hope!) I’m using Accelerate and Clang 18.1.8:
$> mamba list
...
libblas 3.9.0 26_osxarm64_accelerate conda-forge
libcblas 3.9.0 26_osxarm64_accelerate conda-forge
liblapack 3.9.0 26_osxarm64_accelerate conda-forge
...
libclang-cpp18.1 18.1.8 default_h5c12605_5 conda-forge
...
Although `import numpy as np; np.__config__.show()` yields references to clang 16.0.6, which bothers me a little:
Build Dependencies:
  blas:
    detection method: pkgconfig
    found: true
    include directory: /Users/jon/miniforge/envs/vulcan/include
    lib directory: /Users/jon/miniforge/envs/vulcan/lib
    name: blas
    openblas configuration: unknown
    pc file directory: /Users/jon/miniforge/envs/vulcan/lib/pkgconfig
    version: 3.9.0
  lapack:
    detection method: internal
    found: true
    include directory: unknown
    lib directory: unknown
    name: dep4569863840
    openblas configuration: unknown
    pc file directory: unknown
    version: 1.26.4
Compilers:
  c:
    args: -ftree-vectorize, -fPIC, -fstack-protector-strong, -O2, -pipe, -isystem,
      /Users/jon/miniforge/envs/vulcan/include, -fdebug-prefix-map=/Users/runner/miniforge3/conda-bld/numpy_1707225421156/work=/usr/local/src/conda/numpy-1.26.4,
      -fdebug-prefix-map=/Users/jon/miniforge/envs/vulcan=/usr/local/src/conda-prefix,
      -D_FORTIFY_SOURCE=2, -isystem, /Users/jon/miniforge/envs/vulcan/include, -mmacosx-version-min=11.0
    commands: arm64-apple-darwin20.0.0-clang
    linker: ld64
    linker args: -Wl,-headerpad_max_install_names, -Wl,-dead_strip_dylibs, -Wl,-rpath,/Users/jon/miniforge/envs/vulcan/lib,
      -L/Users/jon/miniforge/envs/vulcan/lib, -ftree-vectorize, -fPIC, -fstack-protector-strong,
      -O2, -pipe, -isystem, /Users/jon/miniforge/envs/vulcan/include, -fdebug-prefix-map=/Users/runner/miniforge3/conda-bld/numpy_1707225421156/work=/usr/local/src/conda/numpy-1.26.4,
      -fdebug-prefix-map=/Users/jon/miniforge/envs/vulcan=/usr/local/src/conda-prefix,
      -D_FORTIFY_SOURCE=2, -isystem, /Users/jon/miniforge/envs/vulcan/include, -mmacosx-version-min=11.0
    name: clang
    version: 16.0.6
  c++:
    args: -ftree-vectorize, -fPIC, -fstack-protector-strong, -O2, -pipe, -stdlib=libc++,
      -fvisibility-inlines-hidden, -fmessage-length=0, -isystem, /Users/jon/miniforge/envs/vulcan/include,
      -fdebug-prefix-map=/Users/runner/miniforge3/conda-bld/numpy_1707225421156/work=/usr/local/src/conda/numpy-1.26.4,
      -fdebug-prefix-map=/Users/jon/miniforge/envs/vulcan=/usr/local/src/conda-prefix,
      -D_FORTIFY_SOURCE=2, -isystem, /Users/jon/miniforge/envs/vulcan/include, -mmacosx-version-min=11.0
    commands: arm64-apple-darwin20.0.0-clang++
    linker: ld64
    linker args: -Wl,-headerpad_max_install_names, -Wl,-dead_strip_dylibs, -Wl,-rpath,/Users/jon/miniforge/envs/vulcan/lib,
      -L/Users/jon/miniforge/envs/vulcan/lib, -ftree-vectorize, -fPIC, -fstack-protector-strong,
      -O2, -pipe, -stdlib=libc++, -fvisibility-inlines-hidden, -fmessage-length=0,
      -isystem, /Users/jon/miniforge/envs/vulcan/include, -fdebug-prefix-map=/Users/runner/miniforge3/conda-bld/numpy_1707225421156/work=/usr/local/src/conda/numpy-1.26.4,
      -fdebug-prefix-map=/Users/jon/miniforge/envs/vulcan=/usr/local/src/conda-prefix,
      -D_FORTIFY_SOURCE=2, -isystem, /Users/jon/miniforge/envs/vulcan/include, -mmacosx-version-min=11.0
    name: clang
    version: 16.0.6
  cython:
    commands: cython
    linker: cython
    name: cython
    version: 3.0.8
Machine Information:
  build:
    cpu: aarch64
    endian: little
    family: aarch64
    system: darwin
  cross-compiled: true
  host:
    cpu: arm64
    endian: little
    family: aarch64
    system: darwin
Python Information:
  path: /Users/jon/miniforge/envs/vulcan/bin/python
  version: '3.12'
SIMD Extensions:
  baseline:
  - NEON
  - NEON_FP16
  - NEON_VFPV4
  - ASIMD
  found:
  - ASIMDHP
  not found:
  - ASIMDFHM
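FWIW, one runtime check that doesn’t depend on trusting conda’s package names; this assumes the threadpoolctl package is installed (and, I believe, only fairly recent versions of it can identify Accelerate):

```python
# List the BLAS/LAPACK libraries actually loaded into this process.
from threadpoolctl import threadpool_info

for lib in threadpool_info():
    print(lib["user_api"], lib["internal_api"], lib.get("filepath"))
```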
Just for fun, the relevant part of the stacktrace:
File ~/miniforge/envs/vulcan/lib/python3.12/site-packages/pymc/sampling/mcmc.py:906, in sample(draws, tune, chains, cores, random_seed, progressbar, progressbar_theme, step, var_names, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, blas_cores, model, compile_kwargs, **kwargs)
904 _print_step_hierarchy(step)
905 try:
--> 906 _mp_sample(**sample_args, **parallel_args)
907 except pickle.PickleError:
908 _log.warning("Could not pickle model, sampling singlethreaded.")
File ~/miniforge/envs/vulcan/lib/python3.12/site-packages/pymc/sampling/mcmc.py:1318, in _mp_sample(draws, tune, step, chains, cores, rngs, start, progressbar, progressbar_theme, traces, model, callback, blas_cores, mp_ctx, **kwargs)
1316 try:
1317 with sampler:
-> 1318 for draw in sampler:
1319 strace = traces[draw.chain]
1320 strace.record(draw.point, draw.stats)
File ~/miniforge/envs/vulcan/lib/python3.12/site-packages/pymc/sampling/parallel.py:478, in ParallelSampler.__iter__(self)
471 task = progress.add_task(
472 self._desc.format(self),
473 completed=self._completed_draws,
474 total=self._total_draws,
475 )
477 while self._active:
--> 478 draw = ProcessAdapter.recv_draw(self._active)
479 proc, is_last, draw, tuning, stats = draw
480 self._completed_draws += 1
File ~/miniforge/envs/vulcan/lib/python3.12/site-packages/pymc/sampling/parallel.py:334, in ProcessAdapter.recv_draw(processes, timeout)
332 idxs = {id(proc._msg_pipe): proc for proc in processes}
333 proc = idxs[id(ready[0])]
--> 334 msg = ready[0].recv()
336 if msg[0] == "error":
337 old_error = msg[1]
File ~/miniforge/envs/vulcan/lib/python3.12/multiprocessing/connection.py:250, in _ConnectionBase.recv(self)
248 self._check_closed()
249 self._check_readable()
--> 250 buf = self._recv_bytes()
251 return _ForkingPickler.loads(buf.getbuffer())
File ~/miniforge/envs/vulcan/lib/python3.12/multiprocessing/connection.py:430, in Connection._recv_bytes(self, maxsize)
429 def _recv_bytes(self, maxsize=None):
--> 430 buf = self._recv(4)
431 size, = struct.unpack("!i", buf.getvalue())
432 if size == -1:
File ~/miniforge/envs/vulcan/lib/python3.12/multiprocessing/connection.py:399, in Connection._recv(self, size, read)
397 if n == 0:
398 if remaining == size:
--> 399 raise EOFError
400 else:
401 raise OSError("got end of file during message")
EOFError:
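One more thing I’ll try before settling for cores=1: mp_ctx is right there in the sample() signature at the top of that trace, so forcing a specific process start method is cheap to test (whether “fork” vs “spawn” is actually implicated on Apple Silicon is pure speculation on my part):

```python
import pymc as pm

with my_model:  # same placeholder model as above
    # Force a specific multiprocessing start method; "spawn" and
    # "forkserver" are the standard alternatives to "fork" on macOS.
    idata = pm.sample(mp_ctx="spawn")
```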