Chinese Restaurant Process Clustering using Pólya's urn scheme

I’ve written a CRP clustering model using Pólya’s urn scheme, following this tutorial and the DP Mixtures tutorial from the docs.

I think I’ve got the model right, at least in theory, but the sampler shows no mixing, so I’m obviously still doing something wrong.
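
For reference, here is a minimal NumPy-only sketch of the urn scheme itself, independent of PyMC3 and Theano. It is not the notebook's code: the function name `sample_crp` and the `rng` argument are hypothetical, and it assumes 0-indexed customers, so customer n opens a new table with probability alpha / (alpha + n) and joins existing table k with probability n_k / (alpha + n).

```python
import numpy as np


def sample_crp(n_customers, alpha, rng=None):
    """Draw table assignments for n_customers from a CRP with concentration alpha."""
    rng = np.random.default_rng() if rng is None else rng
    assignments = np.zeros(n_customers, dtype=int)
    counts = []  # counts[k] = number of customers already seated at table k

    for n in range(n_customers):
        # Existing tables are weighted by occupancy, a new table by alpha.
        # The weights sum to n + alpha, so dividing normalises them.
        probs = np.array(counts + [alpha], dtype=float) / (alpha + n)
        table = rng.choice(len(probs), p=probs)
        if table == len(counts):
            counts.append(1)       # the last slot means "open a new table"
        else:
            counts[table] += 1
        assignments[n] = table

    return assignments
```

Something like `sample_crp(100, alpha=1.0)` returns an integer array of table labels, which can be handy as a sanity check against whatever the model graph produces.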

[Image: crpc-mix]

The notebook can be found here.

I’d appreciate input on how to make this model work. A good starting point might be finding a way to drop max_tables, perhaps by introducing pm.Potential (?)

Thanks.

I came up with the following (working) code:

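import numpy as np
import theano.tensor as T


class CRPClustering(object):
    def __init__(self, size):
        self.size = size
        self.n_tables = 0
        # Symbolic vector holding one table index per customer.
        self.table_assignments = T.zeros(size, dtype=int)
        # One uniform draw per customer, used to pick "new table" vs. "occupied table".
        self.p0 = np.random.random(self.size)

    def chinese_restaurant_process(self, alpha, max_tables=10):
        if self.size < 1:
            return None

        for customer in range(self.size):
            # A new table is opened with probability alpha / (alpha + customer).
            if 1. * alpha / (alpha + customer) > self.p0[customer]:
                self.table_assignments = self.choose_unoccupied(customer)
            else:
                self.table_assignments = self.choose_occupied(customer)

        # One-hot encoding; assumes at most max_tables tables were opened.
        return T.extra_ops.to_one_hot(self.table_assignments, max_tables)

    def choose_unoccupied(self, customer):
        self.n_tables += 1
        # set_subtensor returns a new tensor, so the caller must reassign the result.
        return T.set_subtensor(self.table_assignments[customer], self.n_tables - 1)

    def choose_occupied(self, customer):
        # Occupancy counts of tables 0..n_tables-1, normalised to probabilities.
        # float() avoids integer division under Python 2.
        counts = np.unique(self.table_assignments[:customer].eval(), return_counts=True)[1]
        p = counts / float(customer)
        random_assignment = np.random.choice(self.n_tables, p=p)
        return T.set_subtensor(self.table_assignments[customer], random_assignment)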

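I had previously assumed that theano.tensor.set_subtensor is an in-place operation, without checking the docs first. It actually returns a new symbolic tensor, so the result has to be assigned back to self.table_assignments.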

Now, though, I need help figuring out how to make alpha a model parameter again, as before (I had to assign a fixed value for this to work). I tried rewriting everything above in Theano, but I kept running into errors because theano.tensor.switch does not evaluate lazily. Worse still, theano.ifelse.ifelse also does not evaluate lazily when test values are involved, as they are with PyMC3.
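
One possible direction, offered as a sketch rather than a tested fix: instead of branching on alpha while building the graph, a given seating can be scored with the closed-form CRP partition probability, p(z | alpha) = alpha^K * prod_k (n_k - 1)! * Gamma(alpha) / Gamma(alpha + n), which is smooth in alpha and needs no switch or ifelse. The plain-Python version below only illustrates the formula. The name `crp_log_prob` is hypothetical, and wiring an equivalent Theano expression into PyMC3 (for example through pm.Potential) is an assumption that is not shown or verified here.

```python
import math
from collections import Counter


def crp_log_prob(assignments, alpha):
    """Log-probability of the partition induced by `assignments` under CRP(alpha)."""
    n = len(assignments)
    table_sizes = Counter(assignments).values()  # n_k for each occupied table

    log_p = len(table_sizes) * math.log(alpha)               # alpha^K
    log_p += sum(math.lgamma(size) for size in table_sizes)  # prod_k (n_k - 1)!
    log_p += math.lgamma(alpha) - math.lgamma(alpha + n)     # Gamma(alpha) / Gamma(alpha + n)
    return log_p
```

For two customers and alpha = 1, both `crp_log_prob([0, 0], 1.0)` and `crp_log_prob([0, 1], 1.0)` equal log(1/2), matching the urn scheme where the second customer joins or leaves the first table with equal probability.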