Negative binomial as a likelihood

Hey every one, I’m trying to solve question 5M16 from the Bayesian Modeling and Computation In Python book, the question is: "Besides “hour” the bike dataset has other covariates, like “temperature”.Fit a splines using both covariates.The simplest way to do this is by defining a separated spline/design matrix for each covariate. Fit a model with a Negative Binomial likelihood.

(a) Run diagnostics to check the sampling is correct and modify the model and or sample hyperparameters accordingly.

(b) How the rented bikes depend on the hours of the day and how on the temperature?

(c) Generate a model with only the hour covariate to the one with the “hour” and “temperature”. Compare both model using LOO, LOO-PIT and posterior predictive checks.

(d) Summarize all your findings"

now i tried solving it with this code, but somehow it is not working for me, something about the negative binomial distribution is causing the model to generate -inf values, if anyone has a suggestion to offer it’ll be life saving for me, here is the code that i used:

%%capture
!wget https://raw.githubusercontent.com/BayesianModelingandComputationInPython/BookCode_Edition1/main/data/bikes_hour.csv
bikes = pd.read_csv("bikes_hour.csv")
bikes.sort_values(by="hour", inplace=True)
bikes["count_normalized"] = (bikes["count"] - bikes["count"].mean()) / bikes["count"].std()
N_knots = 18
knot_list_hour = np.quantile(bikes.hour, np.linspace(0, 1, N_knots))[1:-1]
B_hour = dmatrix("bs(cnt, knots=knots, degree=3, include_intercept=True) - 1",{"cnt": bikes.hour.values, "knots": knot_list_hour})
N_knots2 = 12
knot_list_temp = np.quantile(bikes.temperature, np.linspace(0, 1, N_knots2))[1:-1]
# knot_list_temp = np.linspace(bikes.temperature.min(), bikes.temperature.max(), N_knots2)[1:-1]
B_temp = dmatrix("bs(cnt, knots=knots, degree=3, include_intercept=True) - 1",{"cnt": bikes.temperature.values, "knots": knot_list_temp})
with pm.Model() as m1:
    τh = pm.HalfCauchy('τh', 1)
    βh = pm.Normal("βh", mu=0, sigma=τh, shape=B_hour.shape[1])
    τt = pm.HalfCauchy('τt', 0.1)
    βt = pm.Normal("βt", mu=0, sigma=τt, shape=B_temp.shape[1])

    μh = pm.Deterministic("μh", pm.math.dot(np.asfortranarray(B_hour), βh))
    μt = pm.Deterministic("μt", pm.math.dot(np.asfortranarray(B_temp), βt))
    μ = pm.Deterministic("μ", μh + μt)

    α = pm.Exponential("α", 1)
    y = pm.NegativeBinomial("y", mu= μ, alpha=α, observed=bikes["count_normalized"].values)
    idata1 = pm.sample(1000, tune=2000, chains=2, target_accept=0.9)

Welcome!

I would assume the difficulty is coming from the fact that \mu must be >0 and I don’t immediately see anything in your model that ensures that this is true.

Can you suggest a way to do that?

Well for testing purposes you can try using the following to clip it below by 0

https://www.pymc.io/projects/docs/en/stable/api/generated/pymc.math.clip.html

as such

pm.math.clip(μ, 0, np.inf)

and see if that resolves the problem, if it does then you know what needs to be fixed. However whether or not that is sensical in this context depends on your model. Alternatively, if the entries for B_hour and B_temp are positive (I could not check because I don’t know where dmatrix comes from is it xgboost?), then you can maybe use halfnormal for your βh

    βh = pm.HalfNormal("βh", sigma=τh, shape=B_hour.shape[1])
    βt = pm.HalfNormal("βt", sigma=τh, shape=B_hour.shape[1])

in which then

μ = pm.Deterministic("μ", μh + μt)

would be positive. However again, whether or not HalfNormals are sensical for these priors depends on what you want from the model. But in the end the mu parameter for the negative binomial represents the average number of failures in a sequence of binomial trials before the nth success, so there is really no sense in having a negative value for this parameter. If there is no “model sensical” way to make this parameter mu positive then you might want to re-evaluate your model or your choice of likelihood maybe?

Can you for instance use μh, μt to produce a probability parameter instead to supply it to NegativeBinomial. For instance you can try to pass it through some sigmoid function to get something between 1 and 0 or through some exponential function. This would be the best approach I think.

Also have a look here

and perhaps the term “Negative Binomial Regression” for more insight.