I’m working with Dominik on this project; if it’s useful, I’d like to highlight that the failure point is accessing the shape attribute of the sparse matrix. The error can even be reproduced directly in Theano:
import numpy as np
import scipy.sparse as sp
import theano
import theano.sparse
N = 1_000_000
density = 1e-7
theano.config.compute_test_value = 'raise'
rng = np.random.default_rng()
sX = sp.random(N, N, density=density, format='csr', data_rvs=np.ones, random_state=rng, dtype=int)
tX = theano.sparse.basic.as_sparse(sX)
tX.shape  # triggers the MemoryError below
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
/loc/scratch/48760635/ipykernel_36364/1210979029.py in <module>
11 sX = sp.random(N, N, density=density, format='csr', data_rvs=np.ones, random_state=rng, dtype=int)
12 tX = theano.sparse.basic.as_sparse(sX)
---> 13 tX.shape
~/.conda/envs/env_a/lib/python3.8/site-packages/theano/sparse/basic.py in <lambda>(self)
288 return dense_from_sparse(self)
289
--> 290 shape = property(lambda self: tensor.shape(dense_from_sparse(self)))
291 # don't worry!
292 # the plan is that the ShapeFeature in tensor.opt will do shape propagation
~/.conda/envs/env_a/lib/python3.8/site-packages/theano/graph/op.py in __call__(self, *inputs, **kwargs)
251
252 if config.compute_test_value != "off":
--> 253 compute_test_value(node)
254
255 if self.default_output is not None:
~/.conda/envs/env_a/lib/python3.8/site-packages/theano/graph/op.py in compute_test_value(node)
128 thunk.outputs = [storage_map[v] for v in node.outputs]
129
--> 130 required = thunk()
131 assert not required # We provided all inputs
132
~/.conda/envs/env_a/lib/python3.8/site-packages/theano/graph/op.py in rval(p, i, o, n)
474 # default arguments are stored in the closure of `rval`
475 def rval(p=p, i=node_input_storage, o=node_output_storage, n=node):
--> 476 r = p(n, [x[0] for x in i], o)
477 for o in node.outputs:
478 compute_map[o][0] = True
~/.conda/envs/env_a/lib/python3.8/site-packages/theano/sparse/basic.py in perform(self, node, inputs, outputs)
913 out[0] = x
914 else:
--> 915 out[0] = x.toarray()
916 assert _is_dense(out[0])
917
~/.conda/envs/env_a/lib/python3.8/site-packages/scipy/sparse/compressed.py in toarray(self, order, out)
1037 if out is None and order is None:
1038 order = self._swap('cf')[0]
-> 1039 out = self._process_toarray_args(order, out)
1040 if not (out.flags.c_contiguous or out.flags.f_contiguous):
1041 raise ValueError('Output array must be C or F contiguous')
~/.conda/envs/env_a/lib/python3.8/site-packages/scipy/sparse/base.py in _process_toarray_args(self, order, out)
1200 return out
1201 else:
-> 1202 return np.zeros(self.shape, dtype=self.dtype, order=order)
1203
1204
MemoryError: Unable to allocate 7.28 TiB for an array with shape (1000000, 1000000) and data type int64
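As a sanity check, the 7.28 TiB figure in the MemoryError is exactly what an N x N dense int64 array would require:

```python
import numpy as np

N = 1_000_000
itemsize = np.dtype(np.int64).itemsize  # 8 bytes per element
dense_bytes = N * N * itemsize          # size of the dense zeros array np.zeros would allocate
dense_tib = dense_bytes / 2**40         # convert bytes to TiB
print(f"{dense_tib:.2f} TiB")           # 7.28 TiB, matching the MemoryError message
```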
In the traceback, which is equivalent to the second half of the traceback above, there is a comment explaining that the shape of a sparse variable is computed by converting it to a dense matrix and taking the shape of that; the plan is for the ShapeFeature optimizer in tensor.opt to later replace the dense_from_sparse call with direct shape propagation. The problem is that tt.grad, called by gradient1, accesses the sparse matrix's shape in a way that gives the ShapeFeature optimizer no chance to remove the dense_from_sparse call (here, the compute_test_value machinery evaluates dense_from_sparse eagerly, before any graph optimization runs), so an N x N zeros matrix is allocated just to read off its shape.
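For contrast, a minimal scipy-only sketch (no Theano involved, using the same parameters as the repro above) showing that the underlying sparse matrix already knows its shape without any densification:

```python
import numpy as np
import scipy.sparse as sp

N = 1_000_000
rng = np.random.default_rng(0)
sX = sp.random(N, N, density=1e-7, format='csr', data_rvs=np.ones,
               random_state=rng, dtype=int)

# scipy stores the shape as plain metadata, so this is free:
print(sX.shape)  # (1000000, 1000000)
# only the nonzero entries (~density * N * N of them) are materialized:
print(sX.nnz)
```

This is why shape propagation (rather than densification) is the right fix: the information is already sitting on the scipy object.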