SamplingError with Gamma dist for censored model

Hello, everyone!

I am modeling a censored data with the pm.Censored distribution, leveraging a pm.Gamma distribution for the likelihood. My model looks something like this (very simplified):

with pm.Model(coords=coords) as toy_model:
    var1 = pm.MutableData("var1", df["var1"].values, dims="obs_id")
    var2 = pm.MutableData("var2", df["var2"].values, dims="obs_id")

    a = pm.Uniform("a", lower=0, upper=10)
    b = pm.Normal("b", mu=0, sigma=3)

    mu = pm.Deterministic("mu", pm.math.exp(
        (a + b * var2) * pm.math.log(var1)
    ))

    sigma = pm.HalfNormal("sigma", sigma=2)
    
    y_obs = df["y_obs"].replace({0: 1}).values
    upper = np.where(df["is_censored"] == 1, df["y_obs"], np.inf)
    
    gamma_dist = pm.Gamma.dist(mu=mu, sigma=sigma)
    Y_obs = pm.Censored("Y_obs", gamma_dist, lower=None, upper=upper, observed=y_obs)

The problem is, no matter how I tune and reparametrize my model, I keep getting SamplingErrors due to failures on my model’s initial evaluation at starting point, with Y_obs being -np.inf.

Some things I’ve already tested:

  1. It works smoothly with a pm.Normal distribution if I define mu = (a + b * var2) * log(var1) and take the logarithm of y_obs. So I believe it’s something related to the pm.Gamma itself. My guess here is that maybe some shenanigans are happening around the alpha, beta, mu and sigma positiveness constraint.
  2. Also, if I define upper = [np.inf] * len(df), it works smoothly even with the pm.Gamma distribution. So I believe the problem is on the evaluation of 1 - CDF(upper, dist) when x = upper. See pm.Censored docs.
  3. I’ve tried around with setting initvals for each parameter, but I understand that’s not the best practice here. It didn’t solve my problem as well.
  4. I understand that a possible cause for Y_obs = -np.inf may be that my priors are not able to generate the observed data. I’ve done some prior predictive sampling, looking at histograms for Y_obs and it looks good. I’ve also checked mu for positiveness constraint, and it looks okay as well. Not sure what else I should investigate here.

I’m kinda stuck here and appreciate any help!

Does model.debug() give any useful feedback?

Did you double check some y_obs are not larger than corresponding upper? You use y_obs as your observed but upper is determined by df[“y_obs”] which has 0s where y_jobs has 1.

2 Likes

Sorry for the late reply, you are absolutely right! I changed that on my real model and it solved the problem. I still had some problems with the sampling of the gamma distribution as the likelihood, but I’ve changed it to an exponential distribution and it looks better - also a bit faster.

Thank you very much!