Time varying survival model (poisson regression) not porting to pymc v4

I have a hierarchical time-varying weibull survival model with a model structure based off of this pymc3 example

It works fine for pymc3 (3.11.5), but it ends up with 100% divergences in pymc v4 (4.1.4). When I look at the posterior, every draw within chains is identical
(i.e. for all i, 0:1000, this returns the same array idata1.posterior["beta"].sel({"chain":0, "draw": i }). This also occurs for all RVs, not just beta ). I ran into the same issue where all of the draws were identical in this thread as well.

Any ideas?

Had to include this one as a notebook since the simulation code is so long, but here’s the model code below

    with pm.Model(coords=coords) as model:

        # style level parameters
        log_k = pm.Normal('log_k', 0.4, 0.4, dims="group")
        log_lambd = pm.Normal('log_lambd', 3.5, 0.5, dims="group")

        # Helper variables
        k = pm.Deterministic("k", pm.math.exp(log_k), dims="group")
        lambd = pm.Deterministic("lambd", pm.math.exp(log_lambd), dims="group")

        # base hazard
        lambda0 = pm.Deterministic("lambda0", k/lambd * (intervals[:,None]/lambd)**(k-1), dims=("intervals", "group") )

        # constrain the sd to 0.5, since I think 99% of the effect below 5x impact ( exp(0.5*3) = 4.5 )
        beta = pm.Normal("beta", 0.0, 0.5, dims="group")
        
        # hazard
        lambda_ = pm.Deterministic("lambda_", pm.math.exp(beta[g_i] * X.T) * lambda0[:, g_i] )

        # force the likelihood to be 0 for any time point that a unit never reached
        mu = pm.Deterministic("mu", exposure * lambda_.T)
        obs = pm.Poisson("obs", mu, observed=damage) 

And to see the full code, just download the attachment and change the .py extension to .ipynb

pymc-time varying.py (16.5 KB)

1 Like

Maybe it is related to the initial value? Could you check in v4 with pm.sample(..., discard_tuned_samples=False) and see if the samples in trace.warmup_posterior also stuck?

the trace.warmup_posterior suffers from the same issues of being stuck.

If it helps, these are the logps of the initial value (same for both pymc3 and pymc v4)

{'log_k': -0.05, 'log_lambd': -4.52, 'beta': -4.52, 'obs': -4265.52}

I just noticed something very strange - the problem is resolved when I get rid of the dims for "group" - is it possible there’s a bug related to the dims API?

Also note that the point logps of the initial point end up being different with this complete pooling model {'log_k': -0.0, 'log_lambd': -0.23, 'beta': -0.23, 'obs': -4265.52}

Definitely seems like a bug, could you check whether there is missing value or anything funny happening in that column?

I just double checked, I’m not seeing any missing values, mislabeled groups, or indices without a corresponding coord. The code works fine when running in pymc3 as well

@michaelosthege for dims expert opinion.

I’d recommend to compare the variable shapes to see if they are the same between the v3 and v4 models.

If you think there’s a bug with dims-informed shape of a variable, you could pass the explicit shape just to make sure that the RV is not getting resized because of dims.

So I checked the shapes and they’re the same between pymc3 and pymc v4 which makes sense because the code is the same. I checked via idata.posterior.dims and also just looking at the shapes of the data variables.

I used the explicit shape argument and it also runs into the same problems as when using dims.

For some reason this model specification just doesnt fit in pymc v4 even though it did in pymc3, however it does fit without any group level indexing in pymc v4.

Not sure how to troubleshoot and identify the problem here because it should work