Yes, I think it’s the number of parameters. I can imagine the computational graph (logp and its gradient) easily taking up what’s left of your RAM.
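
If you want a quick sanity check, here is a back-of-the-envelope sketch of how memory scales with parameter count. The function name `estimate_graph_memory_gib` and the `n_buffers` multiplier are my own assumptions for illustration, not anything the library reports; the real graph can hold many more intermediate arrays than this, so treat the result as a rough lower bound.

```python
def estimate_graph_memory_gib(n_params, bytes_per_value=8, n_buffers=3):
    """Rough lower bound on memory (GiB) for arrays the size of the parameter vector.

    n_params        : total number of scalar parameters in the model
    bytes_per_value : 8 for float64, 4 for float32
    n_buffers       : assumed number of parameter-sized arrays held at once
                      (e.g. values, gradient, one intermediate); a guess
    """
    return n_params * bytes_per_value * n_buffers / 2**30


# Hypothetical example: 50 million float64 parameters
print(f"{estimate_graph_memory_gib(50_000_000):.2f} GiB")
```

If the number you get is already close to your free RAM, the logp and gradient graph is a plausible culprit, since it will typically need more than this.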