I’m trying to apply a model similar to the one defined in section 2.2 of this paper. Here is the Stan code for the model, if that helps.
My data has the following columns: user_id, t, l, y, total_revenue. Each row is a daily discretized observation window: t is the calendar time of that day, encoded as days since some start date; l is the cumulative day count since the first observation for that user (i.e. lifetime); y is whether the user made a purchase on that day; and total_revenue is the cumulative revenue observed for that user up through that day.
The data has 10,000 users and 5,789,283 rows.
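To make the layout concrete, here is a tiny made-up frame in the same shape (two users, a few days each; all values fabricated for illustration):

import pandas as pd

# Toy miniature of the real frame described above; values are made up.
gpp_data_example = pd.DataFrame({
    "user_id":       [101, 101, 101, 102, 102],
    "t":             [0, 1, 2, 4, 5],               # calendar day since global start date
    "l":             [1, 2, 3, 1, 2],               # days since this user's first observation
    "y":             [0, 1, 0, 0, 1],               # purchase made on this day?
    "total_revenue": [0.0, 9.99, 9.99, 0.0, 4.99],  # cumulative revenue up through this day
})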
Here’s my model:
import pymc3 as pm

# Derived from the data frame: user count, calendar horizon, and an integer user index per row
N = gpp_data["user_id"].nunique()                                  # 10,000 users
T = gpp_data["t"].max()                                            # last calendar day
user_id = gpp_data["user_id"].astype("category").cat.codes.values  # 0..N-1 code per row

with pm.Model():
    # GP kernel hyperpriors for calendar-time effects
    etasq_short = pm.Normal("etasq_short", 0, 5)
    etasq_long = pm.Normal("etasq_long", 0, 5)
    rhosq_cal = pm.Normal("rhosq_cal", T / 2, T)
    # Hyperpriors for the user-lifetime effect
    etasq_life = pm.Normal("etasq_life", 0, 5)
    rhosq_life = pm.Normal("rhosq_life", T / 2, T)
    # Hyperpriors for the cumulative-revenue effect
    etasq_rev = pm.Normal("etasq_rev", 0, 5)
    rhosq_rev = pm.Normal("rhosq_rev", 20, 10)
    # Mean-function hyperprior for the long-term calendar effect
    mu = pm.Normal("mu", 0, 5)
    # Linear mean-function hyperpriors for the lifetime and total-revenue effects
    lambda_life1 = pm.Normal("lambda_life1", 0, 5)
    lambda_rev1 = pm.Normal("lambda_rev1", 0, 5)
    # Individual user intercepts (HalfNormal so the scale is positive)
    sigma = pm.HalfNormal("sigma", 2.5)
    delta = pm.Normal("delta", 0, sd=sigma, shape=N)
    # One-dimensional GP priors
    alpha_short = pm.gp.Latent(cov_func=etasq_short**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_cal))
    alpha_long = pm.gp.Latent(mean_func=pm.gp.mean.Constant(mu),
                              cov_func=etasq_long**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_cal))
    alpha_life = pm.gp.Latent(mean_func=pm.gp.mean.Linear(coeffs=lambda_life1),
                              cov_func=etasq_life**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_life))
    alpha_rev = pm.gp.Latent(mean_func=pm.gp.mean.Linear(coeffs=lambda_rev1),
                             cov_func=etasq_rev**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_rev))
    # Additive GP
    alpha = alpha_short + alpha_long + alpha_life + alpha_rev
    # Latent function over (t, t, l, total_revenue)
    f = alpha.prior("f", X=gpp_data[["t", "t", "l", "total_revenue"]].values)
    # Latent function plus individual intercept
    eta = f + delta[user_id]
    # Logistic likelihood
    likelihood = pm.Bernoulli("likelihood", p=pm.math.invlogit(eta), observed=gpp_data["y"].values)
    trace = pm.sample(2000)
This produces an error:
ValueError: Input dimension mis-match. (input[0].shape[1] = 5789283, input[1].shape[1] = 4)
The error is raised when the mean functions are added together. When I remove all the mean functions (i.e. assume they’re all zero), I get a MemoryError instead.
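For reference, by “remove all the mean functions” I mean the following variant (same kernels, every GP keeping the default Zero mean); this is the version that hits the MemoryError:

# Same hyperpriors as above; no mean functions anywhere.
alpha_short = pm.gp.Latent(cov_func=etasq_short**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_cal))
alpha_long = pm.gp.Latent(cov_func=etasq_long**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_cal))
alpha_life = pm.gp.Latent(cov_func=etasq_life**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_life))
alpha_rev = pm.gp.Latent(cov_func=etasq_rev**2 * pm.gp.cov.ExpQuad(1, ls=rhosq_rev))
alpha = alpha_short + alpha_long + alpha_life + alpha_rev
f = alpha.prior("f", X=gpp_data[["t", "t", "l", "total_revenue"]].values)  # MemoryError raised here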
So: 1. Curious what I’m doing wrong re: the mean functions, and 2. curious whether I’m supplying incorrect, or unnecessarily large, data to the model — 5,789,283 rows for 10,000 users works out to roughly 579 daily rows per user, and ~5.8 million rows as input to a latent GP seems a little ridiculous. Answering the second question requires reading up on the model, so I understand if that’s off-topic for this Discourse.