I am trying to predict the rate of success from a dataset containing trials and successes.
At the moment the model is parametrised as following:
with pm.Model() as betabinomial:
predictor = pm.Data('predictor', df.predictors)
trials = pm.Data('trials', df.trials)
intercept = pm.Normal('intercept', mu=np.log(0.1), sd=0.001)
beta = pm.Normal('beta', mu=np.log(0.8), sd=0.001)
p_sigma = pm.Exponential('p_sigma', 100)
# Expected value
p_mu = pm.math.invlogit(intercept + beta * predictor)
p_kappa = ((p_mu * (1-p_mu))/p_sigma**2)-1
p_alpha = p_mu * p_kappa
p_beta = (1-p_mu) * p_kappa
# Outcome definition
Y = pm.BetaBinomial('successes', n=impressions, alpha=p_alpha, beta=p_beta, observed=df.successes)
I would like to estimate the parameter p from the Beta distribution on a holdout dataset but I cannot figure out how to do it. I am happy to also separate the beta binomial into a beta prior + a binomial likelihood if it makes it any easier.
Below was my attempt but I was not able to make predictions because the shape of the beta is fixed to the size of the train dataset and sample_posterior_predictive fails if the test dataset is smaller.
with pm.Model() as beta_binomial:
predictor = pm.Data('predictor', df.predictor)
trials = pm.Data('trials', df.trials)
intercept = pm.Normal('intercept', mu=np.log(0.1), sd=0.001)
beta = pm.Normal('beta', mu=np.log(0.8), sd=0.001)
p_sigma = pm.Exponential('p_sigma', 100)
# Expected value
p_mu = pm.math.invlogit(intercept + beta * predictor)
p_kappa = ((p_mu * (1-p_mu))/p_sigma**2)-1
p_alpha = p_mu * p_kappa
p_beta = (1-p_mu) * p_kappa
# Data likelihood
p = pm.Beta('p', alpha=p_alpha, beta=p_beta, shape=df.shape[0])
# Outcome definition
Y = pm.Binomial('successes', n=impressions, p=p, observed=df.successes)
Any suggestion would be greatly appreciated, couldn’t find any suggestion on the Web.