Oh, OK, duh, I see: the system is saving a full topic × word array of probabilities for each entry in the trace, which comes to about 30 MB per sample.
As a Python programmer, I know the value of everything, and the cost of nothing!
Keeping the full trace in memory is obviously never going to work for something as big as this.
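
For a rough sense of scale, here is the back-of-the-envelope arithmetic as a small sketch. The topic count, vocabulary size, and number of samples are made-up values picked only to land near the ~30 MB/sample figure, and `trace_memory_bytes` is a hypothetical helper, not part of any library. It also assumes each probability is stored as an 8-byte float.

```python
def trace_memory_bytes(n_topics, vocab_size, n_samples, bytes_per_value=8):
    """Estimate memory for a trace that keeps one topic x word array per sample.

    bytes_per_value=8 assumes float64 storage (an assumption, not a known fact
    about the sampler being discussed).
    """
    per_sample = n_topics * vocab_size * bytes_per_value
    return per_sample, per_sample * n_samples


# Hypothetical sizes: 100 topics, 40,000-word vocabulary, 1,000 retained samples.
per_sample, total = trace_memory_bytes(n_topics=100, vocab_size=40_000, n_samples=1_000)
print(f"{per_sample / 1e6:.0f} MB per sample, {total / 1e9:.0f} GB for the whole trace")
```

With those assumed sizes, a thousand retained samples already runs to tens of gigabytes, which is why holding the whole trace in RAM does not scale.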
Still curious about the Dirichlet priors, though.