I have built a custom Poisson GLM model class that supports continuous variables, splines, binning, and categorical variables, and also handles predictions. The class had been working very well until sometime last week, when we tried to refit an existing model with minor modifications and the sampler never finished.
Digging into the issue, I came up with a minimal working example that gets stuck on Databricks (where we run the models) once the number of data points exceeds a threshold, but runs just fine on my home computer.
The code to generate the example is this:
import numpy as np
import pymc as pm
import pytensor.tensor as at
from patsy import dmatrix

# Simulated data: a damped oscillation as the log-rate of a Poisson outcome
N = 2000
x = np.linspace(0, 100, N)
y = (np.sin(x * np.exp(-0.01 * x)) * np.exp(-0.01 * x) * 10) / 10
yy = np.random.poisson(np.exp(y))

N_spl = 6
# The cubic B-spline basis built below has N_spl + 2 columns
coords = {'spl': np.arange(N_spl + 2)}
x_norm = (x - x.mean()) / x.std()

with pm.Model(coords=coords) as m:
    # B-spline design matrix on the standardised predictor
    knots = np.quantile(x_norm, np.linspace(0, 1, N_spl))
    B = dmatrix(
        "bs(x, knots=knots, degree=3, include_intercept=True) - 1",
        {'knots': knots[1:-1], 'x': x_norm},
    )
    data = np.asarray(B, order="F")

    # Spline coefficients and Poisson likelihood with a log link
    prior = pm.Normal("prior", 0, 0.3, dims='spl')
    factor = pm.math.dot(data, prior.T)
    mu = at.exp(factor)
    y_obs = pm.Poisson("y", mu, observed=yy)

    trace = pm.sample()
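For context on the N_spl + 2 in the coords: with the four interior knots, degree 3, and the intercept term, the spline basis has 8 columns, which is what the 'spl' dimension has to match. A quick, purely illustrative sanity check:

# The design matrix must have as many columns as the 'spl' coord has entries,
# otherwise the dot product with the coefficient vector fails with a shape error.
print(np.asarray(B).shape)                      # (2000, 8) for N = 2000
assert np.asarray(B).shape == (N, N_spl + 2)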
On Databricks, the model above samples quickly up to and including N=1150; however, already at N=1155 the sampler never starts sampling, and I'm not sure how to debug this. I verified that all packages are at the same versions as in the last working session where the model fitting succeeded, but that didn't help. On my own computer I can set N=100_000 without a problem and the model samples. I would appreciate any ideas on how to debug this weird issue.
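In case it helps, this is a sketch of how I plan to narrow down whether the hang happens while compiling the log-probability and its gradient or only once sampling actually begins. It continues from the model m above and uses PyMC's initial_point / compile_logp / compile_dlogp; the timings and the single-chain idea are just illustrative:

import time

with m:
    ip = m.initial_point()                 # starting values of all free variables

    t0 = time.time()
    logp_fn = m.compile_logp()             # compile the joint log-probability
    dlogp_fn = m.compile_dlogp()           # compile its gradient (what NUTS needs)
    print(f"compile time: {time.time() - t0:.1f}s")

    t0 = time.time()
    print("logp at initial point:", logp_fn(ip))
    print("max |dlogp|:", np.abs(dlogp_fn(ip)).max())
    print(f"evaluation time: {time.time() - t0:.3f}s")

    # Also worth trying: single-chain, single-core sampling to rule out
    # multiprocessing issues on the Databricks driver.
    # trace = pm.sample(chains=1, cores=1)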