Hey every one, I’m trying to solve question 5M16 from the Bayesian Modeling and Computation In Python book, the question is: "Besides “hour” the bike dataset has other covariates, like “temperature”.Fit a splines using both covariates.The simplest way to do this is by defining a separated spline/design matrix for each covariate. Fit a model with a Negative Binomial likelihood.
(a) Run diagnostics to check the sampling is correct and modify the model and or sample hyperparameters accordingly.
(b) How the rented bikes depend on the hours of the day and how on the temperature?
(c) Generate a model with only the hour covariate to the one with the “hour” and “temperature”. Compare both model using LOO, LOO-PIT and posterior predictive checks.
(d) Summarize all your findings"
now i tried solving it with this code, but somehow it is not working for me, something about the negative binomial distribution is causing the model to generate -inf values, if anyone has a suggestion to offer it’ll be life saving for me, here is the code that i used:
%%capture
!wget https://raw.githubusercontent.com/BayesianModelingandComputationInPython/BookCode_Edition1/main/data/bikes_hour.csv
bikes = pd.read_csv("bikes_hour.csv")
bikes.sort_values(by="hour", inplace=True)
bikes["count_normalized"] = (bikes["count"] - bikes["count"].mean()) / bikes["count"].std()
N_knots = 18
knot_list_hour = np.quantile(bikes.hour, np.linspace(0, 1, N_knots))[1:-1]
B_hour = dmatrix("bs(cnt, knots=knots, degree=3, include_intercept=True) - 1",{"cnt": bikes.hour.values, "knots": knot_list_hour})
N_knots2 = 12
knot_list_temp = np.quantile(bikes.temperature, np.linspace(0, 1, N_knots2))[1:-1]
# knot_list_temp = np.linspace(bikes.temperature.min(), bikes.temperature.max(), N_knots2)[1:-1]
B_temp = dmatrix("bs(cnt, knots=knots, degree=3, include_intercept=True) - 1",{"cnt": bikes.temperature.values, "knots": knot_list_temp})
with pm.Model() as m1:
τh = pm.HalfCauchy('τh', 1)
βh = pm.Normal("βh", mu=0, sigma=τh, shape=B_hour.shape[1])
τt = pm.HalfCauchy('τt', 0.1)
βt = pm.Normal("βt", mu=0, sigma=τt, shape=B_temp.shape[1])
μh = pm.Deterministic("μh", pm.math.dot(np.asfortranarray(B_hour), βh))
μt = pm.Deterministic("μt", pm.math.dot(np.asfortranarray(B_temp), βt))
μ = pm.Deterministic("μ", μh + μt)
α = pm.Exponential("α", 1)
y = pm.NegativeBinomial("y", mu= μ, alpha=α, observed=bikes["count_normalized"].values)
idata1 = pm.sample(1000, tune=2000, chains=2, target_accept=0.9)