I ran into problems with theano matrix operations. Here’s the code snippet:
#layout_cols = data[layout_ids].fillna(0) ## this is the numpy dense version
layout_cols = theano.sparse.csc_from_dense(data[layout_ids].fillna(0))
no_layouts = 16
no_products = 100
no_stores = 2
with pm.Model() as quantity_model_1:
    alpha = pm.Uniform("alpha", lower=0, upper=100, shape=(no_stores,no_products))
    g = pm.Normal('g', mu=0., sd=0.1, shape=(no_layouts,no_products))
    # test = pm.math.matrix_dot(layout_cols,g) ## this is the dense version
    # test1 = test*product_matrix
    # k = pm.math.sum(test1,axis=1)
    test = TS.structured_dot(layout_cols,g)
    test1 = TS.basic.mul(test,product_matrix)
    k = TS.basic.sp_sum(test1,axis=1)
    # Data likelihood
    vol = pm.NegativeBinomial('vol', mu=k, alpha=alpha[storeno,product], observed=total_quantity)
    # Inference
    trace = pm.sample(10)
The dense version runs, but it is extremely slow (~12 s per iteration) and gets slower with each iteration. I suspect it is using up all the memory for matrix storage, hence the attempt with sparse matrices.
The sparse version raises this error:
ValueError: Cannot compute test value: input 0 (SparseVariable{csc,float64}) of Op StructuredDot(SparseVariable{csc,float64}, g) missing default value.
Any idea how to address this? Many thanks.
Hmm, what is product_matrix? Is it a theano tensor?
product_matrix is a theano sparse matrix. However, the problem occurred before getting there. The first instance of the problem happened on
Do you get an error when you do layout_cols.tag.test_value?
AttributeError Traceback (most recent call last)
in ()
----> 1 layout_cols.tag.test_value
AttributeError: 'scratchpad' object has no attribute 'test_value'
Could you try assigning a test_value to it?
layout_cols.tag.test_value = scipy.sparse.csc_matrix(data[layout_ids].fillna(0))
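As a rough sketch of what that assignment involves: the test value should be a SciPy CSC matrix whose dtype matches the symbolic variable (float64 in the error message above). The example below uses a small made-up NumPy array in place of data[layout_ids], with np.nan_to_num standing in for .fillna(0). Both are assumptions for illustration, and only the final commented line touches Theano.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical stand-in for data[layout_ids]: 4 rows, 3 layout columns, with gaps.
dense = np.array([[1.0, np.nan, 0.0],
                  [0.0, 0.0, np.nan],
                  [np.nan, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Equivalent of .fillna(0): replace NaN with 0 before converting.
filled = np.nan_to_num(dense)

# Build the CSC matrix with an explicit dtype so it matches csc/float64.
test_value = sp.csc_matrix(filled, dtype=np.float64)

print(test_value.format, test_value.dtype, test_value.shape, test_value.nnz)

# In the model code this would then be attached to the symbolic variable:
# layout_cols.tag.test_value = test_value
```

Checking the format, dtype and shape up front makes it easier to tell a test-value mismatch apart from a problem in the model itself.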
Thanks @junpenglao. It successfully got past that step, then the kernel died…
I remember that the last time I tried using sparse matrices there were quite a few problems as well. Did you try a smaller matrix, just to test whether the code can run?
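One way to run such a small-scale check outside the model is to reproduce the three steps (dot product, elementwise product, row sum) in NumPy/SciPy and compare the dense and sparse paths. This is only a sketch: the toy sizes, the index arrays, and the assumption that layout_cols and product_matrix are one-hot indicator matrices are all made up for illustration, and the SciPy calls stand in for the Theano ops rather than reproducing them.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n_obs, n_layouts, n_products = 6, 4, 5  # toy sizes, not the real ones

# Assumed one-hot layout indicator per observation (sparse) and dense coefficients.
layout_idx = rng.integers(0, n_layouts, size=n_obs)
layout_cols = sp.csc_matrix(np.eye(n_layouts)[layout_idx])
g = rng.normal(0.0, 0.1, size=(n_layouts, n_products))

# Assumed one-hot product indicator per observation, playing the role of product_matrix.
product_idx = rng.integers(0, n_products, size=n_obs)
product_matrix = sp.csc_matrix(np.eye(n_products)[product_idx])

# Dense path: matrix product, elementwise product, row sum.
k_dense = ((layout_cols.toarray() @ g) * product_matrix.toarray()).sum(axis=1)

# Sparse path: the same three steps, keeping the indicator matrices sparse.
test = layout_cols @ g                  # sparse @ dense gives a dense array
test1 = product_matrix.multiply(test)   # elementwise product with a sparse matrix
k_sparse = np.asarray(test1.sum(axis=1)).ravel()

print(np.allclose(k_dense, k_sparse))
```

If the two paths agree on a toy problem, the remaining failures at full size are more likely to be about memory than about the model logic.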
Yes, it ran successfully with a much smaller model. Sounds like I’ll need to break up the overall model into smaller chunks. Any suggestions or best practices on how to manage lots of models?
For example, the original goal was to build a model for 100 products x 300 stores (i.e. ~30,000 combinations). However, I’ll need to run each of these models separately, which means I’ll end up with some 30,000 models to manage.
Is it possible to dynamically name models and retrieve them at a later point?
Thanks.
Yeah, this is a difficult issue. The out-of-the-box solution is to define a model with theano.shared inputs and outputs, and update the values using {}.set_value() for each model. But with 30,000 models it is still a pain.
Maybe a minibatch+VI approach?
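On the question of naming and retrieving many fits: one plain-Python option (an assumption here, not something proposed in the thread) is to keep the results in a dict keyed by (store, product). The sketch below uses a hypothetical fit_one function that only returns summary statistics as a placeholder. In a real workflow that is where the shared inputs would be updated with set_value() and sampling would run.

```python
import numpy as np

def fit_one(quantities):
    """Hypothetical placeholder for one model fit (here just summary stats).

    The real version would update the shared inputs and sample;
    that part is not shown.
    """
    return {"mean": float(np.mean(quantities)), "n": len(quantities)}

# Made-up observations: (store, product, quantity) rows.
rows = [(0, 10, 3), (0, 10, 5), (0, 11, 2), (1, 10, 7), (1, 10, 9)]

# Group quantities by (store, product).
grouped = {}
for store, product, qty in rows:
    grouped.setdefault((store, product), []).append(qty)

# Fit each group and store the result under its key.
results = {key: fit_one(qtys) for key, qtys in grouped.items()}

# Retrieve a specific fit later by its key.
print(results[(1, 10)])
```

Tuple keys avoid having to build and parse model names as strings, and the dict can be saved to disk in chunks if holding all results in memory becomes a problem.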