Running with minibatches (memory constraints)

Ahhhhh, I know what’s happening now!

The casting of the sparse matrix is in the “build_pymc3_model” function that I was running.

self.doc_t_minibatch = pm.Minibatch(self.wordCounts.toarray(), minibatchSize)
self.doc_t = shared(self.wordCounts.toarray()[:minibatchSize], borrow=True)

wordCounts gets expanded to a dense array inside the function, and that’s why the RAM is exploding. I should only cast each minibatch to dense as it’s needed. Instead, I’m densifying the whole matrix and then indexing the minibatches out of that dense copy. Nothing to do with how PyMC3/minibatches work, just my own stupidity.
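For anyone hitting the same thing, here is a rough sketch of what "only cast the minibatch to dense" could look like. It is not my actual model code: `dense_minibatch` is a hypothetical helper, the matrix is a toy stand-in for wordCounts with made-up sizes, and it assumes the counts are stored as a SciPy CSR matrix. How the batch then gets fed into the PyMC3 model is left out.

```python
import numpy as np
from scipy import sparse


def dense_minibatch(word_counts, batch_size, rng):
    """Return a dense (batch_size, n_terms) array for a random set of rows.

    Hypothetical helper: only the selected rows are ever made dense.
    """
    # Pick the row indices while the matrix is still sparse.
    rows = rng.choice(word_counts.shape[0], size=batch_size, replace=False)
    # Slice the sparse matrix first, then cast just that slice to dense.
    return word_counts[rows].toarray()


# Toy stand-in for the document-term matrix (sizes are assumptions).
rng = np.random.default_rng(0)
word_counts = sparse.random(1000, 500, density=0.01, format="csr", random_state=0)

batch = dense_minibatch(word_counts, batch_size=64, rng=rng)
```

The point is the order of operations: `word_counts[rows].toarray()` allocates a dense array for 64 rows, whereas `word_counts.toarray()[rows]` allocates all 1000 rows first and then throws most of them away.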

Thanks a lot for your help! Would have taken me a lot longer to realise without you 🙂
