Problem with theano sparse matrix operations

I ran into problems with theano matrix operations. Here’s the code snippet:

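import theano
import theano.sparse as TS   # assumed alias: the snippet below uses TS for theano.sparse
import pymc3 as pm

#layout_cols = data[layout_ids].fillna(0)   ## this is the numpy dense version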
layout_cols = theano.sparse.csc_from_dense(data[layout_ids].fillna(0))

no_layouts = 16
no_products = 100
no_stores = 2

with pm.Model() as quantity_model_1:    
    alpha = pm.Uniform("alpha", lower=0, upper=100, shape=(no_stores,no_products))
    g = pm.Normal('g', mu=0., sd=0.1, shape=(no_layouts,no_products))

    # test = pm.math.matrix_dot(layout_cols,g)  ## this is the dense version
    #test1 = test*product_matrix
    #k = pm.math.sum(test1,axis=1)
    test = TS.structured_dot(layout_cols,g)
    test1 = TS.basic.mul(test,product_matrix)
    k = TS.basic.sp_sum(test1,axis=1)

    # Data likelihood
    vol = pm.NegativeBinomial('vol', mu=k, alpha=alpha[storeno,product], observed=total_quantity)
    # Inference
    trace = pm.sample(10)

The dense version runs, but it is extremely slow (~12 sec/it) and gets slower with each iteration. I suspect it is using up all the memory for matrix storage, hence the attempt with a sparse matrix.

The sparse matrix version gives me this error:
ValueError: Cannot compute test value: input 0 (SparseVariable{csc,float64}) of Op StructuredDot(SparseVariable{csc,float64}, g) missing default value.

Any idea how to address this? Many thanks.


Hmm, what is product_matrix? Is it a theano tensor?

product_matrix is a theano sparse matrix. However, the problem occurred before getting there. The first instance of the problem happened on

Do you get an error when you do layout_cols.tag.test_value?

AttributeError Traceback (most recent call last)
in ()
----> 1 layout_cols.tag.test_value

AttributeError: 'scratchpad' object has no attribute 'test_value'

Could you try assigning a test_value to it?

layout_cols.tag.test_value = scipy.sparse.csc_matrix(data[layout_ids].fillna(0))

Thanks @junpenglao. It successfully got past that step, then the kernel died…

I remember that the last time I tried using sparse matrices there were quite a few problems as well. Did you try a smaller matrix, just to test whether the code can run?
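A small-scale check like this can also be done outside Theano first, to confirm that the sparse chain (dot, elementwise multiply, row sum) gives the same k as the dense one. Below is a minimal sketch using NumPy and SciPy rather than theano.sparse. The shapes, the random data, and the assumption that both the layout and product matrices are one-hot indicators are made up for illustration and are not taken from the original data.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n_obs, n_layouts, n_products = 50, 16, 100  # hypothetical sizes

# Hypothetical one-hot layout indicators, dense and CSC
layout_dense = np.zeros((n_obs, n_layouts))
layout_dense[np.arange(n_obs), rng.integers(0, n_layouts, n_obs)] = 1.0
layout_sparse = sp.csc_matrix(layout_dense)

# Hypothetical one-hot product indicators, dense and CSC
product_dense = np.zeros((n_obs, n_products))
product_dense[np.arange(n_obs), rng.integers(0, n_products, n_obs)] = 1.0
product_sparse = sp.csc_matrix(product_dense)

# Stand-in for the coefficient matrix g
g = rng.normal(0.0, 0.1, size=(n_layouts, n_products))

# Dense chain: matrix product, elementwise multiply, row sum
k_dense = ((layout_dense @ g) * product_dense).sum(axis=1)

# Sparse chain: same three steps with SciPy sparse operations
test = layout_sparse @ g                    # sparse @ dense
test1 = product_sparse.multiply(test)       # elementwise multiply
k_sparse = np.asarray(test1.sum(axis=1)).ravel()

print(np.allclose(k_dense, k_sparse))
```

If the two results agree at a small size, any failure at full size is more likely a memory issue than a logic error in the model.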

Yes, it ran successfully with a much smaller model. Sounds like I’ll need to break up the overall model into smaller chunks. Any suggestions or best practices on how to manage lots of models?
For example, the original goal was to build a model for 100 products x 300 stores (i.e. ~30,000 combinations). However, I’ll need to run each of these models separately, which means I’ll end up with some 30,000 models to manage.
Is it possible to dynamically name models and retrieve them at a later point?


Yeah, this is a difficult issue. The out-of-the-box solution is to define one model with theano.shared inputs and outputs, and update their values with .set_value() for each model. But with 30,000 models it is still a pain.
Maybe a minibatch+VI approach?
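
On the earlier question of naming models dynamically and retrieving them later: one simple option is a plain dictionary keyed by (store, product). The sketch below is only an illustration. build_model is a hypothetical placeholder that returns a small dict; in practice it would return whatever object represents the model or its trace.

```python
from itertools import product as cartesian


def build_model(store, prod):
    """Hypothetical placeholder: stands in for building or fitting one model."""
    return {
        "name": f"quantity_model_s{store}_p{prod}",
        "store": store,
        "product": prod,
    }


stores = range(3)    # hypothetical: 3 stores
products = range(4)  # hypothetical: 4 products

# Registry of models keyed by (store, product)
models = {(s, p): build_model(s, p) for s, p in cartesian(stores, products)}

# Retrieve a specific model later by its key
m = models[(2, 1)]
print(m["name"])
```

With 30,000 combinations, holding every fitted model in memory may not be practical, so the registry could store file paths to saved traces instead of the objects themselves.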