The machine has over 99% of its memory allocated (out of 244GB) and is spending nearly all its time swapping to disk. I have a hard time believing that a basic demo is this memory intensive, and am wondering if there may be a memory leak in the recent release of PyMC3.
The behavior is that memory allocation grows steadily until the run is about 20% complete, at which point it exceeds 99% of physical memory and the machine starts to swap. The allocation does not appear to happen all up front; it is continuous throughout execution.
I can’t get it to work under any circumstances. The behavior is always the same: the process allocates approximately 100MB of additional memory per second, and allocated memory keeps growing until it exceeds the machine’s physical capacity, at which point performance collapses due to swapping. I have tried 61GB, 122GB, and 244GB machines; all produce the same outcome, except that the larger-memory instances run longer before they start to thrash.
It dies during initialization, whether using ADVI (v3.1) or jitter+adapt_diag (v3.2), usually around 20% complete depending on machine specs.
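To quantify the growth rate described above, one could poll the process's peak resident set size while sampling runs. Below is a minimal stdlib sketch (the function names `rss_mb` and `log_memory` and the polling interval are mine, not from PyMC3); a steady climb of roughly 100MB per reading at one-second intervals would corroborate the leak:

```python
import resource
import time

def rss_mb():
    """Return this process's peak resident set size in MB.

    Note: on Linux ru_maxrss is reported in kilobytes;
    on macOS it is reported in bytes, so this conversion
    assumes a Linux host (as in the report above).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

def log_memory(interval_s=1.0, samples=5):
    """Print peak RSS at fixed intervals and return the readings.

    Run this in a background thread (or a separate monitoring
    process) while pm.sample() executes to watch allocation grow.
    """
    readings = []
    for _ in range(samples):
        readings.append(rss_mb())
        print(f"peak RSS: {readings[-1]:.1f} MB")
        time.sleep(interval_s)
    return readings
```

Since `ru_maxrss` is a high-water mark, it can only grow; monotonically increasing readings at a roughly constant rate would match the ~100MB/s behavior reported here.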
I have tried all combinations of the following:
- PyMC3 v3.2 and v3.1
- Ubuntu 16.04, Python 3.5.2, Theano 0.9.0
I would love to use PyMC3 for a computing project, but I can’t proceed with any confidence unless I can find a way to complete runs with a reasonably available quantity of memory.
I would appreciate any insight you might have. Is there a known-good configuration on AWS (i.e. a combination of operating system, Python version, PyMC3 version, etc.)?
Thanks for reporting the solution. Really puzzling that memory is managed so differently on AWS Linux. This probably makes it a Theano issue, but definitely open an issue (or comment on the one you dug up) with your findings.