OverflowError: Python int too large to convert to C long

I’m stuck on an error. I’m looking for a solution, and I also hope to use this case to learn how to debug PyMC models.

The following is a small BART model that reproduces the OverflowError. I encountered the same issue with another model, and I've found several similar questions, but the other posts don't contain any data or code to replicate the issue. I'm hoping this small model is useful for pinning down the problem.

The data I used is in the attached file, and the code reads:

import pymc as pm
import pymc_bart as pmb
import numpy as np

data = np.loadtxt("../data/data.csv", delimiter=",")
print(data.shape)
X = data[:, :-1]
Y = data[:, -1]
print(X.shape, Y.shape)

n_obs = len(Y)

rng = np.random.default_rng(42)

with pm.Model() as model:
    w = pmb.BART("w", X=X, Y=Y, m=100)
    mean = pm.Deterministic("mean", w)
    sigma = pm.HalfNormal("sigma", sigma=0.1)
    y = pm.TruncatedNormal(
        'y',
        mu=mean,
        sigma=sigma,
        lower=-0.1,
        upper=0.1, observed=Y
    )

    idata = pm.sample(random_seed=rng, compute_convergence_checks=False)

data.csv (29.5 KB)

The traceback I got reads:

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
Cell In[40], line 13
      4 sigma = pm.HalfNormal("sigma", sigma=0.1)
      5 y = pm.TruncatedNormal(
      6     'y',
      7     mu=mean,
   (...)     10     upper=0.1, observed=Y
     11 )
---> 13 idata = pm.sample(random_seed=rng, compute_convergence_checks=False)

File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/sampling/mcmc.py:928, in sample(draws, tune, chains, cores, random_seed, progressbar, progressbar_theme, step, var_names, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, blas_cores, model, compile_kwargs, **kwargs)
    926 _print_step_hierarchy(step)
    927 try:
--> 928     _mp_sample(**sample_args, **parallel_args)
    929 except pickle.PickleError:
    930     _log.warning("Could not pickle model, sampling singlethreaded.")

File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/sampling/mcmc.py:1408, in _mp_sample(draws, tune, step, chains, cores, rngs, start, progressbar, progressbar_theme, traces, model, callback, blas_cores, mp_ctx, **kwargs)
   1405 strace = traces[draw.chain]
   1406 if not zarr_recording:
   1407     # Zarr recording happens in each process
-> 1408     strace.record(draw.point, draw.stats)
   1409 log_warning_stats(draw.stats)
   1411 if callback is not None:

File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/backends/ndarray.py:116, in NDArray.record(self, point, sampler_stats)
    114     for data, vars in zip(self._stats, sampler_stats):
    115         for key, val in vars.items():
--> 116             data[key][draw_idx] = val
    117 elif self._stats is not None:
    118     raise ValueError("Expected sampler_stats")

OverflowError: Python int too large to convert to C long

My environment specifics:

Ubuntu 24.04
Python 3.12
pymc 5.25.1
pymc-bart 0.10.0

Thanks a lot in advance!


Some sampler statistic uses a Python integer that's larger than what int64 can encode.

It would be great to see what that (key, val) pair is when it fails.

What should I do to get the pair?

I suspect the BART output w grows very large, but I don't know how to check that, or how to apply prior knowledge to it, e.g., to constrain w within a range.

You could try going into an interactive debugger. The simplest approach is to edit the source code and change it to something like:

try:
    data[key][draw_idx] = val
except OverflowError:
    print(f"Writing summary statistic failed for {key=}, {val=}")
    raise

The Python file is located at ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/backends/ndarray.py, and for you the failing code is at line 116.

CC @aloctavodia while we are at it

Here is what I got:

Writing summary statistic failed for key='variable_inclusion', val=124116368418827357883

And the traceback:

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
Cell In[3], line 13
      4 sigma = pm.HalfNormal("sigma", sigma=0.1)
      5 y = pm.TruncatedNormal(
      6     'y',
      7     mu=mean,
   (...)     10     upper=0.1, observed=Y
     11 )
---> 13 idata = pm.sample(random_seed=rng, compute_convergence_checks=False)

File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/sampling/mcmc.py:928, in sample(draws, tune, chains, cores, random_seed, progressbar, progressbar_theme, step, var_names, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, blas_cores, model, compile_kwargs, **kwargs)
    926 _print_step_hierarchy(step)
    927 try:
--> 928     _mp_sample(**sample_args, **parallel_args)
    929 except pickle.PickleError:
    930     _log.warning("Could not pickle model, sampling singlethreaded.")

File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/sampling/mcmc.py:1408, in _mp_sample(draws, tune, step, chains, cores, rngs, start, progressbar, progressbar_theme, traces, model, callback, blas_cores, mp_ctx, **kwargs)
   1405 strace = traces[draw.chain]
   1406 if not zarr_recording:
   1407     # Zarr recording happens in each process
-> 1408     strace.record(draw.point, draw.stats)
   1409 log_warning_stats(draw.stats)
   1411 if callback is not None:

File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/backends/ndarray.py:117, in NDArray.record(self, point, sampler_stats)
    115 for key, val in vars.items():
    116     try:
--> 117         data[key][draw_idx] = val
    118     except OverflowError:
    119         print(f"Writing summary statistic failed for {key=}, {val=}")

OverflowError: Python int too large to convert to C long
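
For context, that value is far beyond what int64 can hold. A quick sanity check with plain NumPy (independent of PyMC) reproduces the same error:

```python
import numpy as np

val = 124116368418827357883  # the value from the debug print above

# int64 tops out at 2**63 - 1 = 9223372036854775807, so this value
# cannot fit in the int64-backed sampler-stats array.
print(val > np.iinfo(np.int64).max)  # True

arr = np.zeros(1, dtype=np.int64)
try:
    arr[0] = val  # the same kind of assignment the backend performs
except OverflowError as err:
    print("OverflowError:", err)
```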

@aloctavodia

Hey everyone, just checking in on the BART model issue. Any sense of when we might have a fix for this?

I’m trying to figure out my next steps: if it’ll be sorted in a few days, I’m happy to wait. But if it’s going to take longer, I might jump in and try implementing the model in another language to keep things moving.

If I go that route, what do you think would be our best bet? TensorFlow Probability? Stan? Or something else entirely? I’d love to get your thoughts.

Thanks a lot!

The issue is with the variable inclusion statistic. We used to store it as a list of vectors (or something similar), but we recently switched to an integer encoding. That is fine as long as it stays a Python integer, but we get the error when it is converted to int64. Not sure how I missed this earlier. We should go back to the previous way of storing the variable inclusion, or think of a better alternative.
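
To illustrate how such an integer encoding can blow past int64, here is a hypothetical packing scheme (not necessarily pymc-bart's exact one): storing per-variable inclusion counts as base-1000 digits of a single integer grows the value exponentially with the number of variables.

```python
import numpy as np

# Hypothetical encoding: pack per-variable inclusion counts into one
# integer, base 1000. NOT necessarily pymc-bart's actual scheme.
counts = [7, 42, 3, 118, 0, 55, 9, 1]

encoded = 0
for c in counts:
    encoded = encoded * 1000 + c  # shift by one base-1000 "digit"

# With just 8 variables the packed value already exceeds int64's
# ~9.2e18 range, while remaining a valid arbitrary-precision Python int.
print(encoded > np.iinfo(np.int64).max)  # True

# Decoding recovers the counts, so the encoding itself is lossless.
decoded = []
x = encoded
for _ in counts:
    decoded.append(x % 1000)
    x //= 1000
decoded.reverse()
print(decoded == counts)  # True
```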

You could store it in a NumPy object array (instead of an int64 array). Not sure whether ArviZ will be happy about it when we convert to InferenceData, though.
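
The object-dtype idea can be sketched like this with plain NumPy (whether ArviZ/NetCDF accepts it downstream is the open question):

```python
import numpy as np

big = 124116368418827357883  # larger than int64 max

# An object-dtype array stores references to Python objects, so each
# element is an arbitrary-precision Python int and cannot overflow.
stats = np.empty(3, dtype=object)
stats[0] = big
stats[1] = big * big
stats[2] = 0

print(type(stats[0]))        # <class 'int'>
print(stats[1] == big ** 2)  # True
```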

You can temporarily patch it locally with code similar to what you used for debugging; just pass instead of re-raising the error. The sampler statistics aren't needed for sampling itself, they're only there for the user afterwards.

try:
    data[key][draw_idx] = val
except OverflowError:
    pass  # Or store as -1 to be identifiable later

That's what we used to do. It works, but it created some issues when saving the InferenceData/NetCDF to disk.

@ccyang you can try installing pymc-bart directly from GitHub with `pip install git+https://github.com/pymc-devs/pymc-bart.git`; a new release will be ready soon.


Upgrading to pymc-bart 0.11.0 solved the issue. Thank you, guys!
