Getting multinomial class probabilities during posterior prediction on test Data

tstesco · January 31, 2018, 4:42pm

Hello PyMC3 community! I’ve been working on a multinomial class prediction model and have been pulling my hair out trying to find the correct approach to get the posterior class probabilities when predicting on new (test) data. I know there must be a simple way, such as the one described in this Stack Overflow post. It is also similar to the post on an Observed Deterministic variable (can’t post another link).

The difference is that I am trying to get the pm.Deterministic sample values not from the previously sampled trace, but from the posterior evaluated on hold-out test data.

Here’s example pseudo code in the same form as my model:

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt

# example data: one-hot class labels and a scalar predictor
y_train = np.array([[0, 0, 1], [0, 0, 1], [0, 1, 0]])
x_train = np.array([10, 11, 20])

# set training data as theano shared variables
xt = theano.shared(x_train)
yt = theano.shared(y_train)

with pm.Model() as my_model:
    # tuning variable
    theta_1 = pm.Normal('theta_1', mu=1.25, sd=0.1)
    # deterministic transformation in some function (`function` is a placeholder)
    class_param = function(xt, theta_1)
    p = pm.Deterministic('p', tt.nnet.softmax(class_param))
    observed = pm.Multinomial('obs', n=1, p=p, observed=yt)
    step = pm.Metropolis()
    trace = pm.sample(draws=3000, step=step)

# discard the first 2000 draws as burn-in
trace_burnt = trace[2000:]
# swap in the hold-out inputs (x_test is assumed to be defined elsewhere)
xt.set_value(x_test)
ppc = pm.sample_ppc(trace_burnt, samples=500, model=my_model)

So in this case I’m trying to get a vector of class probabilities [P(C=0), P(C=1), P(C=2)], sampled 500 times for each data point in x_test, instead of a multinomial binary prediction vector such as [1, 0, 0] 500 times per data point. These are the values of my 'p' variable, which is a pm.Deterministic.

It seems like I’m just missing something here. Please let me know if there is a simple way to do it, any help at all is really appreciated. Currently I am sampling these binary predictions N times, summing and dividing each by N. Or I am taking confidence intervals with statsmodels.stats.proportion.multinomial_proportions_confint that accepts these binary prediction vectors.
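For reference, the workaround of averaging the binary predictions can be sketched in NumPy as below. This is only an illustration: the `draws` array is simulated and stands in for `ppc['obs']`, the `true_p` values are made up, and `class_frequencies` is a hypothetical helper name. It assumes the draws have shape (n_draws, n_points, n_classes).

```python
import numpy as np

def class_frequencies(draws):
    """Average one-hot draws of shape (n_draws, n_points, n_classes) over draws."""
    draws = np.asarray(draws, dtype=float)
    return draws.mean(axis=0)

# Simulated stand-in for ppc['obs']: 500 one-hot draws for 2 test points.
rng = np.random.RandomState(1)
true_p = np.array([[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]])  # made-up probabilities
draws = np.stack([rng.multinomial(1, p, size=500) for p in true_p], axis=1)

# Estimated class probabilities per test point, shape (n_points, n_classes).
p_hat = class_frequencies(draws)
```

Note that this only estimates the average class probabilities per test point; the spread of 'p' across posterior samples is not recovered this way.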

junpenglao · January 31, 2018, 4:55pm

Not sure I understand what you mean here. What would be the expected output in this case?

tstesco · January 31, 2018, 5:00pm

I’m trying to get the values of the unobserved deterministic 'p' variable, that is, the per-class probabilities it passes to the pm.Multinomial variable: for example [0.5, 0.4, 0.1] instead of [1, 0, 0]. The multinomial variable is doing as it should, but I also want to analyze the class probabilities for different inputs.

junpenglao · January 31, 2018, 5:18pm

Oh I see. Unfortunately, there is no easy way to do it for a Deterministic node. You need to select points from the trace (point = trace._straces[chain_idx].point(point_idx)) and pass the posterior sample of theta_1 (point['theta_1']) through the deterministic function (function and tt.nnet.softmax). You might want to replace tt.nnet.softmax with a NumPy version to ease the computation.
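A minimal NumPy sketch of that manual approach might look like the following. Everything here is an assumption for illustration: `class_param_fn` is a made-up stand-in for the undefined `function` in the question, the `theta_samples` array is simulated in place of real posterior draws (indexing the trace as `trace_burnt['theta_1']` should return them as an array), and the `x_test` values are invented.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along `axis`."""
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def class_param_fn(x, theta):
    """Hypothetical stand-in for the thread's undefined `function`.

    Maps inputs of shape (n_points,) to class scores of shape (n_points, 3).
    """
    weights = np.array([-1.0, 0.0, 1.0])  # made-up per-class slopes
    return np.outer(x, weights) * theta / 10.0

def posterior_class_probs(theta_samples, x_new):
    """Return class probabilities of shape (n_samples, n_points, n_classes)."""
    return np.stack([softmax(class_param_fn(x_new, th)) for th in theta_samples])

# Simulated stand-in for the posterior draws of theta_1.
rng = np.random.RandomState(0)
theta_samples = rng.normal(1.25, 0.1, size=500)
x_test = np.array([12.0, 18.0])  # made-up hold-out inputs

p_samples = posterior_class_probs(theta_samples, x_test)
p_mean = p_samples.mean(axis=0)                              # posterior mean per point
p_lo, p_hi = np.percentile(p_samples, [2.5, 97.5], axis=0)   # 95% interval per class
```

Each slice `p_samples[i]` holds the class probabilities for every test point under posterior draw i, so summaries such as the mean or percentile intervals can be taken directly over the first axis.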

Tip: If you are sampling another stochastic node that is not observed (for example, if in this example you have instead p=Dirichlet('p', tt.nnet.softmax(class_param))), you can sample its posterior prediction via ppc = pm.sample_ppc(trace_burnt, samples=500, vars=['p'], model=my_model)

tstesco · January 31, 2018, 5:29pm

Thanks for giving some insight. I guess it’s not as simple as I hoped. I’ll try your tips and add anything else I come across.
