Multi-class BART Model Assistance

dv_mepi · July 6, 2022, 1:43am

Hello,

I’ve been looking into various use-cases for BART models and in particular have read through the Ch7 example in Bayesian Modeling and Computation in Python as well as the similar example notebook. I also looked at this binary classification example.

I was having trouble getting a binary classification model to work because I was unable to recover the labels after pmx.bart.predict for an out-of-sample (hold-out) data set.

My data consists of five (5) variables and a binary response variable.

import pymc as pm
import pymc_experimental as pmx
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y_vals, test_size=0.25, random_state=42, stratify= y_vals)

with pm.Model() as bart_model:
    μ = pmx.BART("μ", X= X_train, Y= y_train, m= 50)
    θ = pm.Deterministic('θ', pm.math.sigmoid(μ))
    y = pm.Bernoulli("y", p= θ, observed= y_train)

with bart_model:
    trace = pm.sample()
    trace.extend(pm.sample_posterior_predictive(trace))
    predtest = pmx.bart.predict(trace, rng, X= X_test)

After I use pmx.bart.predict on the out-of-sample data, the recovered values look like:

array([-6.93663277, -8.30200875, -7.18379557, ..., -5.97973494,
       -6.49198275, -7.58192513])

However, after I modify the pmx.bart.predict source code with return sigmoid(pred), the result is more like I would expect:

array([0.00398453, 0.0002981 , 0.00223829, ..., 0.00137673, 0.00213677,
       0.00066914])

Moving on to my actual application, I have again have a dataset with five (5) variables and a response variable with five classes array([0, 1, 2, 3, 4, 5]. So, I tried to extend the above to a multi-class problem using a softmax function.

with pm.Model() as bart_model:
    μ = pmx.BART("μ", X= X_train, Y= y_train, m= 50)
    θ = pm.math.softmax(μ)
    y = pm.Categorical("y", p= θ, observed= y_train)


with bart_model:
    trace = pm.sample()
    trace.extend(pm.sample_posterior_predictive(trace))

Unfortunately, when I check the posterior_predictive, it doesn’t return the ‘y’ class values. Instead, It returns an array of single y values that are obviously not the classes or softmax output.

trace.posterior_predictive.y.values

array([[[3841, 2703, 4008, ..., 2289,  426, 3936],
        [6172, 1810, 7215, ..., 1460, 3219, 4301],
        [3542, 3660, 2105, ..., 5984,  230, 5654],
        ...,
        [7081, 2198, 5241, ..., 5605, 7490, 4982],
        [7121, 6629, 4655, ..., 2991, 7219, 1199],
        [ 975, 4058, 3640, ..., 5188, 6736, 6479]],
        ...,

And the subsequent prediction on hold-out data:

with bart_model:
    predtest = pmx.bart.predict(trace, rng, X= X_test)

Returns an array with results that do not appear to have any class structure.

array([[[-58.23271276],
        [-58.79261905],
        [-56.67507199],
        ...,
        [-57.04217796],
        [-62.01896868],
        [-56.81891283]]])

Any help here would be greatly appreciated! Thanks in advance.

ricardoV94 · July 6, 2022, 8:41am

CC @aloctavodia

aloctavodia · July 6, 2022, 12:45pm

We have recently added a shape/size argument to BART. You can do something like:

    μ = pmx.BART("μ", X= X_train, Y= y_train, m= 50, shape=(num_categories, num_observations))

dv_mepi · July 6, 2022, 1:28pm

Thank you - I tried the below, but it did not work.

with pm.Model(coords=COORDS) as bart_model:
    μ = pmx.BART("μ", X= X_train, Y= y_train, m= 50, shape=(6, 7500))
    #θ = pm.Deterministic('θ', pm.math.softmax(μ))
    θ = pm.math.softmax(μ)
    y = pm.Categorical("y", p= θ, observed= y_train)

Which threw a really long error summarized with:

AssertionError: Could not broadcast dimensions

aloctavodia · July 6, 2022, 1:48pm

Are you sure you have the last version of pymc-experimental? I tried with a silly example and it works, in my example y_train has shape (100,2) and Bart shape (2, 100). I will try with a more realistic example, it will be a good exercise to see how BART works with categorical distributions.

dv_mepi · July 6, 2022, 1:52pm

I’m still fairly new to this so I’m sure it’s a shape issue on my side (I installed pymc_experimental this weekend). My y_train shape is (7500, ) but has 6 different integer categories. I think I need to encode y_train.

aloctavodia · July 6, 2022, 3:31pm

The following example seems to work.

X = np.linspace(0, 1, 120)[:,None]
Y = np.repeat([0, 1, 2], 40)
plt.plot(X[:,0], Y, ".")

cat

with pm.Model() as bart_model:
    μ = pmx.BART("μ", X=X, Y=Y, m=50, shape=(3, 120))
    θ = pm.Deterministic('θ', pm.math.softmax(μ, axis=0))
    y = pm.Categorical("y", p=θ.T, observed=Y)
    idata = pm.sample()

Now we check the results

# get the posterior mean of θ
posterior_mean = az.extract_dataset(idata, var_names='θ').mean("sample")
#using the posterior_mean generate posterior_predictive samples (syntethic data)
new_y = [np.random.multinomial(n=1, 
         pvals=posterior_mean.sel({"θ_dim_1":i}).values).argmax() for i in range(120)]

plt.plot(X[:,0], new_y, ".")

cat_post

dv_mepi · July 6, 2022, 8:36pm

Thank you, it is working now with a slight modification of adding θ to posterior_mean.sel({"θ_dim_1":i}).θ.values

posterior_mean = az.extract_dataset(trace, var_names='θ').mean("sample")
new_y = [np.random.multinomial(n=1, pvals=posterior_mean.sel({"θ_dim_1":i}).θ.values).argmax() for i in range(7500)]

I was interested in using predicting out-of-sample data, but using pmx.bart.predict doesn’t seem to be doing a good job with unbalanced classes.

aloctavodia · July 6, 2022, 9:02pm

Right, the difference is that I am using the developmental version of ArviZ. Do you have an example that you could share, even if made up. I have not tested BART with categorical likelihoods, it may need some internal tweak, like what we currently do for binomial data.

Note: Also take into account that pmx.bart.predict is making predictions for the values of μ, so if you want to obtain observations you need first need to apply the softmax function and then the categorical distribution, something like:

pvals = pm.math.softmax(pmx.bart.predict(idata, rng, X=X_new)[0], axis=1).eval()
new_y = [np.random.multinomial(n=1, pvals=pvals[i]).argmax() for i in range(120)]
plt.hist(new_y, alpha=0.5)
plt.hist(Y, alpha=0.5)

cat_pred

Topic		Replies	Views
Need help with setup for BART model for binary classification v5 modeling	0	640	November 14, 2022
Categorical BART with Out of Sample Predictions Sharing prediction , bart	1	119	October 11, 2024
Problem to load trained BART model for prediction v5 bart	5	767	December 6, 2023
Binary classification example using the recently added BART model Questions	3	1550	January 13, 2021
Out of sample predict issue	6	554	June 20, 2023

Multi-class BART Model Assistance

Related topics