Using sparse matrices as observed in DensityDist


#1

Hi,

I am working on a generative Bernoulli mixture model. I came up with a model, but I have trouble using my data in sparse form when calculating the log-likelihood.

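# Imports assumed from the names used below
import numpy as np
import pymc3 as pm
import theano.tensor as tt
from pymc3.math import logsumexp

def ll_bern(dri, pi):
    def logp_(value):
        # For each component i: log pi[i] plus the per-row sum of Bernoulli log-probabilities
        logps = [tt.log(pi[i]) + tt.sum(tt.log(value*dri[i,:] + (1-value)*(1 - dri[i,:])), axis=1) for i in range(K)]
        # Mix over components with logsumexp, then sum over rows
        return tt.sum(logsumexp(tt.stacklists(logps)[:, :value.shape[0]], axis=0))
    return logp_

with dir_model:
    pi = pm.Dirichlet('pi', a=np.ones(K), shape=K)
    dri = pm.Dirichlet('dri', a=np.ones((K, B)), shape=(K, B))
    vector_U = pm.DensityDist('vec_u', ll_bern(dri, pi), observed=observed_d)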

observed_d is the data in CSR sparse format. The model works fine with a dense array on a synthetic dataset, but the original dataset is too large to store densely, so I need to make the model work with the sparse matrix.

Help much appreciated.


#2

A sparse matrix is basically a collection of tuples with two indices and one value: (x, y, value). So one workaround I can think of is to index into ll_bern(dri, pi)[x, y] using the indices from the sparse observed_d, and observe only the values of observed_d.
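
To make the indexing idea concrete, here is a minimal NumPy sketch of one way it could work. It is not the model code itself. It assumes the data are binary, so every stored entry equals 1, and it uses the identity log p(x | k) = sum_b log(1 - theta[k, b]) + sum over nonzero b of (log theta[k, b] - log(1 - theta[k, b])), which only needs the coordinates of the nonzero entries. The names csr_to_coords, loglik_dense and loglik_from_coords are hypothetical helpers, and the CSR index arrays are built by hand here rather than taken from observed_d.

```python
import numpy as np

def csr_to_coords(indptr, indices):
    """Expand CSR index arrays into (row, col) coordinates of the stored entries."""
    n_rows = len(indptr) - 1
    # Row r owns indptr[r+1] - indptr[r] stored entries
    rows = np.repeat(np.arange(n_rows), np.diff(indptr))
    return rows, np.asarray(indices)

def loglik_dense(X, pi, theta):
    """Reference Bernoulli mixture log-likelihood on a dense binary matrix X (N, B)."""
    log_px = X @ np.log(theta).T + (1.0 - X) @ np.log1p(-theta).T  # (N, K)
    return np.logaddexp.reduce(np.log(pi) + log_px, axis=1).sum()

def loglik_from_coords(rows, cols, n_rows, pi, theta):
    """Same quantity, using only the coordinates of the entries equal to 1."""
    log_odds = np.log(theta) - np.log1p(-theta)        # (K, B)
    all_zero = np.log1p(-theta).sum(axis=1)            # (K,) log-lik of an all-zero row
    log_px = np.tile(all_zero, (n_rows, 1))            # (N, K)
    # Scatter-add the log-odds of each nonzero entry into its row
    np.add.at(log_px, rows, log_odds[:, cols].T)
    return np.logaddexp.reduce(np.log(pi) + log_px, axis=1).sum()

# Small synthetic check (all values below are made up for illustration)
rng = np.random.default_rng(0)
N, B, K = 50, 8, 3
X = (rng.random((N, B)) < 0.2).astype(float)
pi = rng.dirichlet(np.ones(K))
theta = rng.uniform(0.1, 0.9, size=(K, B))

# Hand-built CSR index arrays for X (np.nonzero returns row-major order)
nz_rows, nz_cols = np.nonzero(X)
indptr = np.concatenate(([0], np.cumsum(np.bincount(nz_rows, minlength=N))))
rows, cols = csr_to_coords(indptr, nz_cols)

dense_ll = loglik_dense(X, pi, theta)
sparse_ll = loglik_from_coords(rows, cols, N, pi, theta)
print(np.isclose(dense_ll, sparse_ll))
```

With a SciPy CSR matrix, the indptr and indices attributes would play the role of the hand-built arrays above. Inside the model, the same arithmetic would still have to be expressed with Theano tensor operations in logp_, with the row and column index arrays passed in as the observed data.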