I’m stuck on an error. I’m looking for a solution, and I also hope to use this case to learn how to debug PyMC models.
The following is a small BART model that reproduces an OverflowError. I encountered the same issue with another model and have found several similar questions, e.g.
The other posts don’t contain any data or code to replicate the issue. I’m hoping this small model can be useful in dealing with the problem.
The data I used is in the attached file, data.csv, listed right after the code. The code reads:
import numpy as np
import pymc as pm
import pymc_bart as pmb

# All columns but the last are predictors; the last column is the response.
data = np.loadtxt("../data/data.csv", delimiter=",")
print(data.shape)
X = data[:, :-1]
Y = data[:, -1]
print(X.shape, Y.shape)
n_obs = len(Y)

rng = np.random.default_rng(42)

with pm.Model() as model:
    # BART prior over the mean function, built from 100 trees.
    w = pmb.BART("w", X=X, Y=Y, m=100)
    mean = pm.Deterministic("mean", w)
    sigma = pm.HalfNormal("sigma", sigma=0.1)
    # Likelihood truncated to [-0.1, 0.1].
    y = pm.TruncatedNormal(
        "y",
        mu=mean,
        sigma=sigma,
        lower=-0.1,
        upper=0.1,
        observed=Y,
    )
    idata = pm.sample(random_seed=rng, compute_convergence_checks=False)
data.csv (29.5 KB)
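Since the likelihood is truncated to [-0.1, 0.1], one check I can run before sampling is whether every observed value is finite and lies inside those bounds. The following is only a sketch: `check_observed` is a helper name I made up, and `y_demo` holds invented values standing in for the last column of data.csv, not the real data.

```python
import numpy as np


def check_observed(y, lower, upper):
    """Count non-finite and out-of-bounds values in the observed data."""
    y = np.asarray(y, dtype=float)
    finite = np.isfinite(y)
    return {
        "n_obs": int(y.size),
        "n_nonfinite": int((~finite).sum()),
        "n_below_lower": int((y[finite] < lower).sum()),
        "n_above_upper": int((y[finite] > upper).sum()),
    }


# Invented values for illustration only (hypothetical, not from data.csv).
y_demo = np.array([-0.05, 0.02, 0.12, np.nan, -0.2])
report = check_observed(y_demo, lower=-0.1, upper=0.1)
print(report)
```

With the real data, the call would be `check_observed(Y, lower=-0.1, upper=0.1)`. Non-zero counts would not by themselves explain the OverflowError, but they would point to a problem in the likelihood that is worth ruling out first.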
The traceback I got reads:
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
Cell In[40], line 13
4 sigma = pm.HalfNormal("sigma", sigma=0.1)
5 y = pm.TruncatedNormal(
6 'y',
7 mu=mean,
(...) 10 upper=0.1, observed=Y
11 )
---> 13 idata = pm.sample(random_seed=rng, compute_convergence_checks=False)
File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/sampling/mcmc.py:928, in sample(draws, tune, chains, cores, random_seed, progressbar, progressbar_theme, step, var_names, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, blas_cores, model, compile_kwargs, **kwargs)
926 _print_step_hierarchy(step)
927 try:
--> 928 _mp_sample(**sample_args, **parallel_args)
929 except pickle.PickleError:
930 _log.warning("Could not pickle model, sampling singlethreaded.")
File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/sampling/mcmc.py:1408, in _mp_sample(draws, tune, step, chains, cores, rngs, start, progressbar, progressbar_theme, traces, model, callback, blas_cores, mp_ctx, **kwargs)
1405 strace = traces[draw.chain]
1406 if not zarr_recording:
1407 # Zarr recording happens in each process
-> 1408 strace.record(draw.point, draw.stats)
1409 log_warning_stats(draw.stats)
1411 if callback is not None:
File ~/.local/share/virtualenvs/MachineLearning-4jP8FBMK/lib/python3.12/site-packages/pymc/backends/ndarray.py:116, in NDArray.record(self, point, sampler_stats)
114 for data, vars in zip(self._stats, sampler_stats):
115 for key, val in vars.items():
--> 116 data[key][draw_idx] = val
117 elif self._stats is not None:
118 raise ValueError("Expected sampler_stats")
OverflowError: Python int too large to convert to C long
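The last frame is a plain item assignment, `data[key][draw_idx] = val`, so my reading is that some sampler statistic is a Python int that does not fit into the array it is being stored in. I don't know which statistic it is or what dtype that array really has; the sketch below assumes a 64-bit integer array and uses a made-up helper, `try_store`, to show that NumPy alone raises the same kind of error.

```python
import numpy as np


def try_store(value, dtype):
    """Write `value` into a one-element array of `dtype`.

    Returns (exception name, message) on overflow, or None on success.
    """
    buf = np.zeros(1, dtype=dtype)
    try:
        buf[0] = value
    except OverflowError as err:
        return type(err).__name__, str(err)
    return None


too_big = 2**70  # larger than any 64-bit integer

result_int = try_store(too_big, np.int64)      # assumed dtype, see lead-in
result_float = try_store(too_big, np.float64)  # floats can hold the magnitude
result_fits = try_store(2**62, np.int64)       # within the int64 range

print("int64, 2**70:  ", result_int)
print("float64, 2**70:", result_float)
print("int64, 2**62:  ", result_fits)
```

If that reading is right, the thing to find is which key in `draw.stats` carries the oversized integer, for example by printing the stats of a failing draw.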
My environment specifics:
Ubuntu 24.04
Python 3.12
pymc 5.25.1
pymc-bart 0.10.0
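In case more version details are useful, this is a small sketch for collecting them with the standard library. The package list is my guess at what might be relevant (numpy, pytensor, and arviz are not mentioned above), and `report_versions` is a made-up helper name.

```python
import platform
from importlib import metadata


def report_versions(packages):
    """Return a dict of installed versions, marking missing packages."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Assumed list of relevant packages; adjust as needed.
versions = report_versions(["pymc", "pymc-bart", "numpy", "pytensor", "arviz"])
for name, version in versions.items():
    print(f"{name}: {version}")
```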
Thanks a lot in advance!