Hi,
I have seen similar questions asked here and on Stack Exchange, and I think I have checked most of the tutorials, but I still don't understand how to make predictions for new data, as in a supervised classification problem.
Examples use sample_posterior_predictive (or the older sample_ppc), but these generate samples from the posterior predictive distribution, don't they? As far as I understand, we instead need to evaluate the posterior predictive distribution given the estimated parameters and the new data (feature vector).
In Bishop’s PRML, the predictive distribution for logistic regression is given as follows:
p(C_1|\phi, \mathbf{t}) = \int p(C_1| \phi, \mathbf{w}) p(\mathbf{w} | \mathbf{t}) d\mathbf{w}
where \phi is the new feature vector, \mathbf{t} is the vector of training labels, and \mathbf{w} is the vector of model parameters (weights).
So, instead of sampling from this distribution, shouldn't we evaluate p(t = C_1|\phi, \mathbf{t}) and p(t = C_2|\phi, \mathbf{t}) directly?
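To make my question concrete, here is roughly what I imagine the evaluation would look like: approximate the integral by averaging the sigmoid over posterior draws of \mathbf{w}. This is just a NumPy sketch with synthetic "posterior" samples; sigmoid, w_samples, and phi are my own placeholder names, not PyMC API.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Stand-in for posterior draws of w (in practice, the MCMC trace);
# here they are just synthetic Gaussian samples for illustration.
w_samples = rng.normal(loc=[1.0, -0.5], scale=0.3, size=(4000, 2))

phi = np.array([0.8, 1.2])  # new feature vector

# Monte Carlo estimate of the predictive integral:
# p(C_1 | phi, t) ≈ (1/S) * sum_s sigmoid(w_s · phi)
p_c1 = sigmoid(w_samples @ phi).mean()
p_c2 = 1.0 - p_c1
print(p_c1, p_c2)
```

Is this averaging over posterior samples what sample_posterior_predictive effectively does, or is it something different?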
A simple explanation for dummies (like me), or a pointer to a simple classification example, would be very helpful.
Thank you.