Bayesian Classifier Combination Model (need help)

Hello, PyMC3 community!

I’ve been trying to implement independent Bayesian classifier combination (IBCC) following Kim & Ghahramani (2012) [Figure 1 in the paper].

  1. Sample the class probabilities from a Dirichlet with hyperparameter nu.

  2. Sample the array of class labels from a Multinomial; I picked a Categorical instead.

  3. Sample the parameters alpha from an Exponential, given hyperparameter lambda.

  4. Sample the classifiers' confusion matrices from Dirichlets parameterized by alpha.

  5. Generate the classes output by each classifier from its confusion matrix, conditioned on the sampled label t at instance i (a NumPy sketch of these five steps follows this list).
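
To make the intended generative process concrete, here is a minimal NumPy-only sketch of these five steps (an illustration rather than part of the model; the constants match the PyMC3 code below):

import numpy as np

rng = np.random.default_rng(0)
I, J, K = 100, 3, 4
nu_vec = np.ones(J)
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]])

p_vec = rng.dirichlet(nu_vec)                              # 1. class probabilities, shape (J,)
t = rng.choice(J, size=I, p=p_vec)                         # 2. true labels, shape (I,)
alpha = rng.exponential(1. / lambda_mat, size=(K, J, J))   # 3. Dirichlet parameters (rate lambda)
pi = np.array([[rng.dirichlet(alpha[k, j]) for j in range(J)]
               for k in range(K)])                         # 4. confusion matrices, shape (K, J, J)
c = np.array([[rng.choice(J, p=pi[k, t[i]]) for k in range(K)]
              for i in range(I)])                          # 5. classifier outputs, shape (I, K)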

At first, I was not able to index the confusion matrices pi by the value of the target t, as below:

import numpy as np
import pymc3 as pm

I = 100  # number of instances
J = 3    # number of classes
K = 4    # number of classifiers

nu_vec = np.array([1. for _ in range(J)], dtype=np.float32)
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]], dtype=np.float32)

IBCC = pm.Model()

with IBCC:

    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)

    t = pm.Categorical(name='t', p=p_vec, shape=I)

    alpha = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(J, J)) for k in range(K)]

    pi = [[pm.Dirichlet('pi_j{}k{}'.format(j, k), alpha[k][j], shape=(3)) for j in range(J)] for k in range(K)]

    c = [[] for k in range(K)]

    for i in range(I):

        for k in range(K):

            c[k].append(pm.Categorical('c_k{}i{}'.format(k, i), p=pi[k][t[i]], observed=c_train[k][i]))

The code above was giving me an error: TypeError: list indices must be integers or slices, not FreeRV.
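
The error can be reproduced with a much smaller model; the point is that pi is a plain Python list, and a list cannot be indexed by a symbolic variable (illustrative sketch, not the actual model):

import numpy as np
import pymc3 as pm

with pm.Model():
    t = pm.Categorical('t', p=np.ones(3) / 3.)                                  # symbolic FreeRV
    rows = [pm.Dirichlet('row_{}'.format(j), a=np.ones(3)) for j in range(3)]   # plain Python list
    probs = rows[t]   # TypeError: list indices must be integers or slices, not FreeRV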

I attempted to solve that by inserting a t_sample = t.random() line inside the model, like this:

IBCC = pm.Model()

with IBCC:
    
    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)

    t = pm.Categorical(name='t', p=p_vec, shape=I)
    
    alpha = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(J, J)) for k in range(K)]
    
    pi = [[pm.Dirichlet('pi_j{}k{}'.format(j, k), a=alpha[k][j], shape=3) for j in range(J)] for k in range(K)]
    
    c = [[] for k in range(K)]
    
    t_sample = t.random()
    
    for i in range(I):
        
        for k in range(K):

            c[k].append(pm.Categorical('c_k{}i{}'.format(k, i), p=pi[k][t_sample[i]], observed=c_train[k][i]))

which seemed to work. However, I am not sure whether that is the correct way to do it.

I followed by doing inference:

with IBCC:
    step1 = pm.sampling.CategoricalGibbsMetropolis([t]+[c[k][i] for i in range(I) for k in range(K)])
    step2 = pm.sampling.Metropolis([alpha[k] for k in range(K)] + [pi[k][j] for j in range(J) for k in range(K)]+
                                  [p_vec])
    trace = pm.sample(50000, step=[step1, step2])

which continues to work, but in the end it does not produce correct results. I have also tried the default pm.sample() without specifying the samplers; it did not produce correct results either. In both cases, the model ends up estimating equal probabilities for all labels for every instance. I would appreciate any suggestion or help! Thank you!

Hi @Jev,

Using list comprehensions to create PyMC3 variables is highly inefficient; if it is necessary, you can do newvar = tt.stack([...]) to compress the list into a single Theano RV.
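
For concreteness, a minimal sketch of that suggestion applied to the alpha variable above (illustrative names only; the single-RV form assumes lam broadcasts against the requested shape):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

K = 4
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]])

with pm.Model():
    # option 1: build the list, then compress it into one Theano tensor
    alpha_list = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(3, 3))
                  for k in range(K)]
    alpha_stacked = tt.stack(alpha_list)                      # shape (K, 3, 3), indexable by RVs

    # option 2 (usually faster): a single RV with the extra dimension in its shape
    alpha_vec = pm.Exponential('alpha_vec', lam=lambda_mat, shape=(K, 3, 3))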

It is a bit unclear to me what the observation would be in your model; is it [K, I]? Also, could you please detail the shapes of the variables in the 5 steps above?

Hello @junpenglao,

Thanks for your response and advice!

The shapes of the variables in the steps above are:

  1. Dirichlet sample of class probabilities: [J,]
  2. Sample of class labels t: [I,]
     (hyperparameter lambda is [J, J])
  3. Sample of alpha: [K, J, J]
  4. Sample of the confusion matrices: [K, J, J]
  5. Class labels generated by the classifiers: [I, K]

I suppose that, using the T.tensor.stack() function, the code would look like this:

import theano as T

IBCC = pm.Model()

with IBCC:
    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)

    t = pm.Categorical(name='t', p=p_vec, shape=I)

    alpha = [pm.Exponential('alpha_{}'.format(k), lam=lambda_mat, shape=(J, J)) for k in range(K)]

    alpha = T.tensor.stack(alpha)

    pi = [[pm.Dirichlet('pi_j{}k{}'.format(j, k), a=alpha[k][j], shape=3) for j in range(J)] for k in range(K)]

    pi = T.tensor.stack(pi)

    c = [[] for k in range(K)]

    t_sample = t.random()

    for i in range(I):

        for k in range(K):

            c[k].append(pm.Categorical('c_k{}i{}'.format(k, i), p=pi[k][t_sample[i]], observed=c_train[k][i]))

    c = T.tensor.stack(c)

Hi @Jev,

I was also interested in an IBCC implementation in PyMC3 and made it work by following the Dawid-Skene example from the docs and also reading about hierarchical models in PyMC3 in this blog. The main problem here is that models should stay vectorized, which means the data needs to be transformed (see the "Massaging your data" section of the mentioned blog post). In your case it could be something like this:

I = 100 # number of instances
J = 3 # number of classes 
K = 4 # number of classifiers

# create data triplets
kk = list()  # classifiers IDs
ii = list()  # instances IDs
y = list()   # response

# initialize majority votes (useful later on for stability of sampling if used as `testval` argument)
z_init = np.zeros( I, dtype=np.int64 )

# create data triplets
for i in range( I ):
    ks = list()
    for k in range( K ):
        dat = data[ i, k, : ]
        x = np.where( dat == 1 )[0][0]
        ks.append( x )
        ii.append( i )
        kk.append( k )
        y.append( x )

    # getting maj vote for work item i 
    z_init[ i ] = np.bincount( np.array( ks ) ).argmax()
    
print("Indexing example: instance {} has class {} called by {}. classifier".format(ii[33], y[33], kk[33]))

This requires that your data is a cube of shape (I, K, J) (the data variable mentioned above, one-hot along the class axis). When this is done, the model gets significantly simpler in form and faster to sample from:

nu_vec = np.array([1. for _ in range(J)], dtype=np.float32)
lambda_mat = np.array([[.3, 3., 2.], [2., .5, 3.], [2., 3., .5]], dtype=np.float32)

IBCC = pm.Model()

with IBCC:
    p_vec = pm.Dirichlet(name='p_vec', a=nu_vec)
    t = pm.Categorical(name='t', p=p_vec, shape=I, testval=z_init)
    alpha = pm.Exponential('alpha', lam=lambda_mat, shape=(J,J), testval=lambda_mat)
    pi = pm.Dirichlet('pi', a=alpha, shape=(K,J,J))
    c = pm.Categorical('c', p=pi[kk, t[ii]], observed=y)
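
As a note on the last line, pi[kk, t[ii]] is fancy indexing on the stacked confusion matrices: for each observation it picks classifier kk's row at the true label of instance ii, so p has one row of J class probabilities per observation, lining up with the observed vector y. A small NumPy analogue just to show the resulting shape (a sketch with made-up index vectors, not part of the model):

import numpy as np

I, J, K = 100, 3, 4
pi_np = np.random.dirichlet(np.ones(J), size=(K, J))   # (K, J, J) confusion matrices
t_np = np.random.randint(J, size=I)                    # (I,) true labels
kk_np = np.tile(np.arange(K), I)                       # classifier index per observation
ii_np = np.repeat(np.arange(I), K)                     # instance index per observation

p = pi_np[kk_np, t_np[ii_np]]                          # one probability row per observation
assert p.shape == (I * K, J)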

I’ve noticed that setting testval for both t (z_init) and alpha (lambda_mat) is necessary for convergence. Inference (you don’t need to sample the c’s):

with IBCC:
    step1 = pm.sampling.CategoricalGibbsMetropolis(vars=[t])
    step2 = pm.sampling.Metropolis(vars=[alpha, pi, p_vec])
    trace = pm.sample(50000, step=[step1, step2])

and access results for t with:

t_hat = np.zeros( I )
for i in range( I ):
    t_hat[ i ] = np.bincount( trace['t'][:,i] ).argmax()

Hope it solves the problem!

Cheers,
Dusan
