Bayesian Classifier Combination Model (need help)

Hello, PyMC3 community!

I’ve been trying to implement independent Bayesian classifier combination (IBCC) following Kim & Ghahramani (2012) [Figure 1 in the paper].

  1. Sample the class probabilities from a Dirichlet with hyperparameter nu.

  2. Sample the array of class labels from a Multinomial; I picked a Categorical instead.

  3. Sample the parameters alpha from an Exponential, given hyperparameter lambda.

  4. Sample the classifiers' confusion matrices from Dirichlets parameterized by alpha.

  5. Generate the classes output by each classifier from its confusion matrix, conditioned on the sampled label t at instance i (a NumPy sketch of these five steps follows this list).
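
To make the intended generative process concrete, here is a minimal NumPy-only sketch of these five steps (an illustration rather than part of the model; the constants match the PyMC3 code below):

import numpy as np

rng = np.random.default_rng(0)
I, J, K = 100, 3, 4
nu_vec = np.ones(J)
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]])

p_vec = rng.dirichlet(nu_vec)                              # 1. class probabilities, shape (J,)
t = rng.choice(J, size=I, p=p_vec)                         # 2. true labels, shape (I,)
alpha = rng.exponential(1. / lambda_mat, size=(K, J, J))   # 3. Dirichlet parameters (rate lambda)
pi = np.array([[rng.dirichlet(alpha[k, j]) for j in range(J)]
               for k in range(K)])                         # 4. confusion matrices, shape (K, J, J)
c = np.array([[rng.choice(J, p=pi[k, t[i]]) for k in range(K)]
              for i in range(I)])                          # 5. classifier outputs, shape (I, K)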

At first, I was not able to index the confusion matrices pi by the value of the target t, as below:

import numpy as np
import pymc3 as pm

I = 100  # number of instances
J = 3    # number of classes
K = 4    # number of classifiers

nu_vec = np.array([1. for _ in range(J)], dtype=np.float32)
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]], dtype=np.float32)

IBCC = pm.Model()

with IBCC:

    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)

    t = pm.Categorical(name='t', p=p_vec, shape=I)

    alpha = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(J, J)) for k in range(K)]

    pi = [[pm.Dirichlet('pi_j{}k{}'.format(j, k), alpha[k][j], shape=(3)) for j in range(J)] for k in range(K)]

    c = [[] for k in range(K)]

    for i in range(I):

        for k in range(K):

            c[k].append(pm.Categorical('c_k{}i{}'.format(k, i), p=pi[k][t[i]], observed=c_train[k][i]))

The code above was giving me an error: TypeError: list indices must be integers or slices, not FreeRV.
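
The error can be reproduced with a much smaller model; the point is that pi is a plain Python list, and a list cannot be indexed by a symbolic variable (illustrative sketch, not the actual model):

import numpy as np
import pymc3 as pm

with pm.Model():
    t = pm.Categorical('t', p=np.ones(3) / 3.)                                  # symbolic FreeRV
    rows = [pm.Dirichlet('row_{}'.format(j), a=np.ones(3)) for j in range(3)]   # plain Python list
    probs = rows[t]   # TypeError: list indices must be integers or slices, not FreeRV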

I attempted to solve that by inserting a t_sample = t.random() line inside the model, like this:

IBCC = pm.Model()

with IBCC:
    
    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)

    t = pm.Categorical(name='t', p=p_vec, shape=I)
    
    alpha = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(J, J)) for k in range(K)]
    
    pi = [[pm.Dirichlet('pi_j{}k{}'.format(j, k), a=alpha[k][j], shape=3) for j in range(J)] for k in range(K)]
    
    c = [[] for k in range(K)]
    
    t_sample = t.random()
    
    for i in range(I):
        
        for k in range(K):

            c[k].append(pm.Categorical('c_k{}i{}'.format(k, i), p=pi[k][t_sample[i]], observed=c_train[k][i]))

which seemed to work. However, I am not sure whether that is the correct way to do it.

I followed by doing inference:

with IBCC:
    step1 = pm.sampling.CategoricalGibbsMetropolis([t]+[c[k][i] for i in range(I) for k in range(K)])
    step2 = pm.sampling.Metropolis([alpha[k] for k in range(K)] + [pi[k][j] for j in range(J) for k in range(K)]+
                                  [p_vec])
    trace = pm.sample(50000, step=[step1, step2])

which continues to work, but in the end it does not produce correct results. I have also tried the default pm.sample() without specifying the samplers; it did not produce correct results either. In both cases, the model ends up estimating equal probabilities for all labels for every instance. I would appreciate any suggestion or help! Thank you!

Hi @Jev,

Using list comprehensions to create PyMC3 variables is highly inefficient; if it is necessary, you can do newvar = tt.stack([...]) to compress the list into a single Theano RV.
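
For concreteness, a minimal sketch of that suggestion applied to the alpha variable above (illustrative names only; the single-RV form assumes lam broadcasts against the requested shape):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

K = 4
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]])

with pm.Model():
    # option 1: build the list, then compress it into one Theano tensor
    alpha_list = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(3, 3))
                  for k in range(K)]
    alpha_stacked = tt.stack(alpha_list)                      # shape (K, 3, 3), indexable by RVs

    # option 2 (usually faster): a single RV with the extra dimension in its shape
    alpha_vec = pm.Exponential('alpha_vec', lam=lambda_mat, shape=(K, 3, 3))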

It is a bit unclear to me what the observation would be in your model; is it [K, I]? Also, could you please detail the shapes of the variables in the 5 steps above?

Hello @junpenglao,

Thanks for your response and advice!

The shapes of the variables in the steps above are:

  1. Dirichlet sample of class probabilities: [J,]
  2. Sample of class labels t: [I,]
     (hyperparameter lambda is [J, J])
  3. Sample of alpha: [K, J, J]
  4. Sample of the confusion matrices: [K, J, J]
  5. Class labels generated by the classifiers: [I, K]

I suppose that, using the T.tensor.stack() function, the code would look like this:

import theano as T

IBCC = pm.Model()

with IBCC:
    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)

    t = pm.Categorical(name='t', p=p_vec, shape=I)

    alpha = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(J, J)) for k in range(K)]

    alpha = T.tensor.stack(alpha)

    pi = [[pm.Dirichlet('pi_j{}k{}'.format(j, k), a=alpha[k][j], shape=3) for j in range(J)] for k in range(K)]

    pi = T.tensor.stack(pi)

    c = [[] for k in range(K)]

    t_sample = t.random()

    for i in range(I):

        for k in range(K):

            c[k].append(pm.Categorical('c_k{}i{}'.format(k, i), p=pi[k][t_sample[i]], observed=c_train[k][i]))

    c = T.tensor.stack(c)

Hi @Jev,

I was also interested in an IBCC implementation in PyMC3 and made it work by following the Dawid-Skene example from the docs and also reading about hierarchical models in PyMC3 in this blog. The main problem here is that models should stay vectorized, which means the data needs to be transformed (see the "Massaging your data" section of the mentioned blog post). In your case it could be something like this:

I = 100 # number of instances
J = 3 # number of classes 
K = 4 # number of classifiers

# create data triplets
kk = list()  # classifiers IDs
ii = list()  # instances IDs
y = list()   # response

# initialize majority votes (useful later on for stability of sampling if used as `testval` argument)
z_init = np.zeros( I, dtype=np.int64 )

# create data triplets
for i in range( I ):
    ks = list()
    for k in range( K ):
        dat = data[ i, k, : ]
        x = np.where( dat == 1 )[0][0]
        ks.append( x )
        ii.append( i )
        kk.append( k )
        y.append( x )

    # getting maj vote for work item i 
    z_init[ i ] = np.bincount( np.array( ks ) ).argmax()
    
print("Indexing example: instance {} has class {} called by {}. classifier".format(ii[33], y[33], kk[33]))

This requires that your data is a cube of shape (I, K, J) (the data variable mentioned above, one-hot along the class axis). When this is done, the model gets significantly simpler in form and faster to sample from:

nu_vec = np.array([1. for _ in range(J)], dtype=np.float32)
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]], dtype=np.float32)

IBCC = pm.Model()

with IBCC:
    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)
    t = pm.Categorical(name='t', p=p_vec, shape=I, testval=z_init)
    alpha = pm.Exponential('alpha', lam=lambda_mat, shape=(J,J), testval=lambda_mat)
    pi = pm.Dirichlet('pi', a=alpha, shape=(K,J,J))
    c = pm.Categorical('c', p=pi[kk, t[ii]], observed=y)
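
As a note on the last line, pi[kk, t[ii]] is fancy indexing on the stacked confusion matrices: for each observation it picks classifier kk's row at the true label of instance ii, so p has one row of J class probabilities per observation, lining up with the observed vector y. A small NumPy analogue just to show the resulting shape (a sketch with made-up index vectors, not part of the model):

import numpy as np

I, J, K = 100, 3, 4
pi_np = np.random.dirichlet(np.ones(J), size=(K, J))   # (K, J, J) confusion matrices
t_np = np.random.randint(J, size=I)                    # (I,) true labels
kk_np = np.tile(np.arange(K), I)                       # classifier index per observation
ii_np = np.repeat(np.arange(I), K)                     # instance index per observation

p = pi_np[kk_np, t_np[ii_np]]                          # one probability row per observation
assert p.shape == (I * K, J)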

I’ve noticed that setting testval for both t (z_init) and alpha (lambda_mat) is necessary for convergence. Inference (you don’t need to sample the c’s):

with IBCC:
    step1 = pm.sampling.CategoricalGibbsMetropolis(vars=[t])
    step2 = pm.sampling.Metropolis(vars=[alpha, pi, p_vec])
    trace = pm.sample(50000, step=[step1, step2])

and access results for t with:

t_hat = np.zeros( I )
for i in range( I ):
    t_hat[ i ] = np.bincount( trace['t'][:,i] ).argmax()

Hope it solves the problem!

Cheers,
Dusan
