I don't have much experience with large datasets, but you could try profiling the memory use:
http://deeplearning.net/software/theano/tutorial/profiling.html
http://docs.pymc.io/notebooks/profiling.html
import theano

# Set these before the function being profiled is compiled.
theano.config.profile = True
theano.config.profile_memory = True
model.profile(model.logpt).summary()