I have built a custom Poisson GLM model class that supports continuous variables, splines, binning, and categorical variables, and also handles predictions. The class had been working very well until sometime last week, when we tried to refit an existing model with minor modifications and the sampler never finished.
Digging into the issue, I came up with a minimal working example that gets stuck on Databricks (where we run the models) once the number of data points exceeds a threshold, but runs just fine on my home computer.
The code to generate the example is this:
import numpy as np
import pymc as pm
import pytensor.tensor as at
from patsy import dmatrix

# Simulated data: a damped oscillation as the log-rate of a Poisson outcome
N = 2000
x = np.linspace(0, 100, N)
y = (np.sin(x * np.exp(-0.01 * x)) * np.exp(-0.01 * x) * 10) / 10
yy = np.random.poisson(np.exp(y))

N_spl = 6
# The cubic B-spline basis built below has N_spl + 2 columns
coords = {'spl': np.arange(N_spl + 2)}
x_norm = (x - x.mean()) / x.std()

with pm.Model(coords=coords) as m:
    # B-spline design matrix on the standardised predictor
    knots = np.quantile(x_norm, np.linspace(0, 1, N_spl))
    B = dmatrix(
        "bs(x, knots=knots, degree=3, include_intercept=True) - 1",
        {'knots': knots[1:-1], 'x': x_norm},
    )
    data = np.asarray(B, order="F")

    # Spline coefficients and Poisson likelihood with a log link
    prior = pm.Normal("prior", 0, 0.3, dims='spl')
    factor = pm.math.dot(data, prior.T)
    mu = at.exp(factor)
    y_obs = pm.Poisson("y", mu, observed=yy)

    trace = pm.sample()
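For context on the N_spl + 2 in the coords: with the four interior knots, degree 3, and the intercept term, the spline basis has 8 columns, which is what the 'spl' dimension has to match. A quick, purely illustrative sanity check:

# The design matrix must have as many columns as the 'spl' coord has entries,
# otherwise the dot product with the coefficient vector fails with a shape error.
print(np.asarray(B).shape)                      # (2000, 8) for N = 2000
assert np.asarray(B).shape == (N, N_spl + 2)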
On Databricks, the model above samples quickly up to and including N=1150; however, already at N=1155 the sampler never starts sampling, and I'm not sure how to debug this. I verified that all packages are at the same versions as in the last working session where the model fitting succeeded, but that didn't help. On my own computer I can set N=100_000 without a problem and the model samples. I would appreciate any ideas on how to debug this weird issue.
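In case it helps, this is a sketch of how I plan to narrow down whether the hang happens while compiling the log-probability and its gradient or only once sampling actually begins. It continues from the model m above and uses PyMC's initial_point / compile_logp / compile_dlogp; the timings and the single-chain idea are just illustrative:

import time

with m:
    ip = m.initial_point()                 # starting values of all free variables

    t0 = time.time()
    logp_fn = m.compile_logp()             # compile the joint log-probability
    dlogp_fn = m.compile_dlogp()           # compile its gradient (what NUTS needs)
    print(f"compile time: {time.time() - t0:.1f}s")

    t0 = time.time()
    print("logp at initial point:", logp_fn(ip))
    print("max |dlogp|:", np.abs(dlogp_fn(ip)).max())
    print(f"evaluation time: {time.time() - t0:.3f}s")

    # Also worth trying: single-chain, single-core sampling to rule out
    # multiprocessing issues on the Databricks driver.
    # trace = pm.sample(chains=1, cores=1)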