How to marginalize latent discrete variables in CDMs (cognitive diagnosis models)?

The code I rewrote is as follows:

```python
import numpy as np
import pymc3 as pm
import theano.tensor as tt

# assumed data: Y (N,I) observed responses, Q (I,K) Q-matrix,
# all_class (C,K) matrix of all attribute profiles

# ideal response for every latent class, shape=(C,I)
all_c_eta = (all_class.reshape(-1,1,K) >= Q.reshape(1,I,K)).prod(axis=-1)
all_c_eta = tt.as_tensor(all_c_eta)

with pm.Model() as DINA:
    # number of latent classes is 8
    C = 8
    # per-student class-membership probabilities, shape=(N,C)
    student_class_p = pm.Dirichlet("student_class_p", tt.ones(C), shape=(N,C))

    # repeat student_class_p I times and reshape to (N,I,C):
    # the mixture weight of class c for each student/item cell
    w = tt.repeat(student_class_p, I).reshape((N,C,I)).transpose(0,2,1)

    # slip and guess parameters, one per item
    s_sample = pm.Beta("s", alpha=1, beta=1, shape=(1,I))
    g_sample = pm.Beta("g", alpha=1, beta=1, shape=(1,I))

    # correct-response probability for every class, shape=(C,I):
    # eta*(1-s) + (1-eta)*g, written as eta*(1-s-g) + g
    all_c_p = pm.Deterministic("all_class_p", all_c_eta*(1-s_sample-g_sample)+g_sample)

    # repeat all_c_p N times and reshape to (N,I,C):
    # correct-response probability per student/item/class
    n_i_c = tt.repeat(all_c_p, N).reshape((C,I,N)).transpose(2,1,0)
    comp_dist = pm.Bernoulli.dist(p=n_i_c, shape=(N,I,C))

    # mixture over the last axis (the C classes); Y has shape (N,I)
    pm.Mixture("Y", w=w, comp_dists=comp_dist, observed=Y)
```
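As a sanity check on the ideal-response line, here is a minimal NumPy sketch of the same broadcast; the tiny Q-matrix and attribute profiles (K=2 skills, I=3 items, C=4 classes) are made-up illustration data, not my actual data:

```python
import numpy as np

# hypothetical example: K=2 skills, I=3 items, C=4 attribute profiles
Q = np.array([[1, 0],
              [0, 1],
              [1, 1]])                    # (I,K): skills each item requires
all_class = np.array([[0, 0],
                      [1, 0],
                      [0, 1],
                      [1, 1]])            # (C,K): every attribute profile

# eta[c,i] = 1 iff profile c masters every skill item i requires
eta = (all_class.reshape(-1, 1, 2) >= Q.reshape(1, 3, 2)).prod(axis=-1)
print(eta)  # shape (C,I)
```

Only the all-skills profile gets an ideal response of 1 on the two-skill item.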

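Incidentally, the `tt.repeat(...).reshape(...).transpose(...)` gymnastics used for `w` (and likewise for `n_i_c`) should be equivalent to a plain broadcast along a new axis; a small NumPy check with made-up shapes:

```python
import numpy as np

N, I, C = 5, 3, 4
a = np.arange(N * C, dtype=float).reshape(N, C)  # stand-in for student_class_p

# the repeat/reshape/transpose route: (N,C) -> (N,I,C)
w = np.repeat(a, I).reshape(N, C, I).transpose(0, 2, 1)

# the same (N,I,C) tensor via plain broadcasting
w2 = np.broadcast_to(a[:, None, :], (N, I, C))
```

Both give `w[n, i, c] == a[n, c]`, so the broadcast form may be easier to read and less error-prone.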
But judging from the results, something seems to be wrong. :thinking:
The correct classification rate is 78.7%, whereas sampling the latent discrete variables directly gives about 85.3%.
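For reference, the marginalization I am trying to express is P(y_n) = Σ_c π_c · Π_i Bernoulli(y_ni | p_ci). A small NumPy sketch with made-up numbers (the C=4 classes, I=3 items, and random probabilities are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
C, I = 4, 3
pi = np.full(C, 1.0 / C)                 # class proportions
p = rng.uniform(0.1, 0.9, size=(C, I))   # correct-response prob per class/item
y = np.array([1, 0, 1])                  # one student's response vector

# per-class likelihood of the whole response vector, then weight by pi
lik_per_class = (p**y * (1.0 - p)**(1 - y)).prod(axis=1)  # shape (C,)
marginal = float((pi * lik_per_class).sum())
```

The sum over classes happens once per student's whole response vector, with the product over items taken inside each class.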

Thanks for your help and time :grin:~