Hi everyone,
Apologies if this question was answered previously - if so, I would appreciate any further reading and references! I searched thoroughly both here and elsewhere, and found little beginner-level material on end-to-end workflows for Bayesian multiple logistic regression.
Below is my code for model checking and relevant dependencies using both PyMC and ArviZ. Judging by the trace plot, summary, etc., the model itself samples without problems. However, when attempting to plot the prior and posterior predictive checks, I think I am either A) totally misunderstanding how to interpret the plots or B) making a mistake when generating the correct or intended plots.
If the former, could you please help me understand how to make sense of the plots? If the latter, is there an error in my code, and what is the correct way to conduct these model checks for Bayesian multiple logistic regression? Additionally, what other follow-up procedures do you recommend for this type of analysis?
Thank you in advance for your time and support!
import arviz as az
import pymc as pm

# 'df' is assumed to be a pandas DataFrame with predictor columns
# 'x1'..'x6' and a binary (0/1) outcome column 'y'

# Define the Bayesian multiple logistic regression model
with pm.Model() as model:
    # Cauchy priors: scale 10 for the intercept, scale 2.5 for the slopes
    # Gelman et al. 2008 DOI: 10.1214/08-AOAS191
    a = pm.Cauchy('intercept', alpha = 0, beta = 10)
    b1 = pm.Cauchy('x1', alpha = 0, beta = (2.50 / (2 * 0.50)))
    b2 = pm.Cauchy('x2', alpha = 0, beta = (2.50 / (2 * 0.50)))
    b3 = pm.Cauchy('x3', alpha = 0, beta = (2.50 / (2 * 0.50)))
    b4 = pm.Cauchy('x4', alpha = 0, beta = (2.50 / (2 * 0.50)))
    b5 = pm.Cauchy('x5', alpha = 0, beta = (2.50 / (2 * 0.50)))
    b6 = pm.Cauchy('x6', alpha = 0, beta = (2.50 / (2 * 0.50)))
    # Probability of y = 1 via the inverse logit link
    mu = pm.invlogit(a + b1 * df['x1'] + b2 * df['x2'] +
                     b3 * df['x3'] + b4 * df['x4'] +
                     b5 * df['x5'] + b6 * df['x6'])
    pm.Bernoulli('logit', p = mu, observed = df['y'])
# Run the Hamiltonian Monte Carlo method
with model:
    trace = pm.sample(draws = 1000, tune = 1000)
    # Increase both 'draws' and 'tune' for true analysis
# Evaluate prior and posterior predictive checks
# (both sampling calls need the model context)
with model:
    prior_pc = pm.sample_prior_predictive(samples = 10)
    # Increase 'samples' for true analysis
az.plot_ppc(prior_pc, observed = True, colors = ['coral',
            'lightskyblue', 'slategrey'], group = 'prior')
with model:
    post_pc = pm.sample_posterior_predictive(trace = trace,
                                             progressbar = True)
az.plot_ppc(post_pc, observed = True, colors = ['coral',
            'lightskyblue', 'slategrey'], group = 'posterior')
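
For reference, here is a minimal sketch of what a prior predictive check computes for a model like the one above, written with only the Python standard library rather than PyMC. It is an illustration under assumptions, not what PyMC does internally: the matrix X is a made-up stand-in for the six predictors in df, and the helper names (invlogit, cauchy, prior_predictive_means) are hypothetical. Each prior draw produces one simulated data set, summarised here by its proportion of ones.

```python
import math
import random


def invlogit(z):
    """Numerically stable inverse logit (logistic sigmoid)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def cauchy(rng, loc, scale):
    """Draw from Cauchy(loc, scale) by inverse-CDF sampling."""
    return loc + scale * math.tan(math.pi * (rng.random() - 0.5))


def prior_predictive_means(X, n_draws, rng):
    """For each prior draw, simulate y for every row of X; return mean(y)."""
    means = []
    for _ in range(n_draws):
        a = cauchy(rng, 0.0, 10.0)                 # intercept prior, scale 10
        b = [cauchy(rng, 0.0, 2.5) for _ in X[0]]  # slope priors, scale 2.5
        ys = []
        for row in X:
            p = invlogit(a + sum(bj * xj for bj, xj in zip(b, row)))
            ys.append(1 if rng.random() < p else 0)
        means.append(sum(ys) / len(ys))
    return means


rng = random.Random(0)
# Hypothetical stand-in for df[['x1', ..., 'x6']]: 200 rows, 6 predictors
X = [[rng.gauss(0.0, 0.5) for _ in range(6)] for _ in range(200)]
means = prior_predictive_means(X, n_draws=500, rng=rng)
# Share of prior draws whose simulated data are almost all 0s or all 1s
extreme = sum(m < 0.1 or m > 0.9 for m in means) / len(means)
print(f"share of draws with mean(y) below 0.1 or above 0.9: {extreme:.2f}")
```

Because the outcome is binary, a per-draw summary such as the proportion of ones may be easier to read than overlaid density curves, which could be part of why the default plots are hard to interpret.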