Right, that's always the case with censoring. I haven't tried this particular model at scale with NUTS, but with something like a negative binomial model, I've often found it more efficient to reparameterize as a Poisson with a latent gamma random effect, even when that adds 100K or more extra parameters. HMC's $\mathcal{O}(D^{1/4})$ scaling in the number of dimensions is pretty great in practice as well as in theory.
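For concreteness, here's a minimal sketch of that reparameterization in PyMC syntax (the priors, simulated data, and variable names are all made up for illustration). Marginally the two models are the same negative binomial; the second just adds one latent gamma rate per observation:

```python
import numpy as np
import pymc as pm

# Made-up overdispersed count data, just so the models have something to condition on.
rng = np.random.default_rng(0)
y_obs = rng.negative_binomial(n=5, p=0.3, size=1000)

# Marginal form: y ~ NegativeBinomial(mu, alpha).
with pm.Model() as nb_marginal:
    mu = pm.LogNormal("mu", 0, 1)
    alpha = pm.Gamma("alpha", 2, 0.1)
    pm.NegativeBinomial("y", mu=mu, alpha=alpha, observed=y_obs)

# Reparameterized form: a latent gamma rate per observation with a Poisson likelihood.
with pm.Model() as poisson_gamma:
    mu = pm.LogNormal("mu", 0, 1)
    alpha = pm.Gamma("alpha", 2, 0.1)
    # E[lam] = mu and Var[lam] = mu^2 / alpha, so y is marginally NegativeBinomial(mu, alpha).
    lam = pm.Gamma("lam", alpha=alpha, beta=alpha / mu, shape=len(y_obs))
    pm.Poisson("y", mu=lam, observed=y_obs)
```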
The geometry is usually much more of a problem. For example, suppose you have a few hundred spatial random effects in a Poisson model, but they're spatially smoothed, so you end up with a hierarchical model. In my experience, that geometry can be much more challenging to sample than simply adding more parameters.
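To make that concrete, here's a deliberately simplified sketch of the kind of hierarchical structure I mean. I've swapped the spatial smoothing (e.g., a CAR/ICAR prior) for an exchangeable normal prior just to keep it short, and all the data and names are made up; the point is that the hard part is the coupling between the effects and their estimated scale (the classic funnel), not the raw parameter count:

```python
import numpy as np
import pymc as pm

# Made-up data: a few thousand counts grouped into a few hundred regions.
rng = np.random.default_rng(1)
n_regions, n_obs = 300, 3000
region = rng.integers(n_regions, size=n_obs)
y_obs = rng.poisson(3.0, size=n_obs)

with pm.Model() as hier_poisson:
    intercept = pm.Normal("intercept", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)  # smoothing scale shared by all random effects
    # Centered parameterization: the prior scale of the effects is itself a parameter,
    # which is what produces the difficult funnel-shaped geometry.
    u = pm.Normal("u", mu=0, sigma=sigma, shape=n_regions)
    pm.Poisson("y", mu=pm.math.exp(intercept + u[region]), observed=y_obs)
```

Non-centering `u` (sampling standard normals and scaling by `sigma`) usually helps here, which again is a fix to the geometry rather than to the number of parameters.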
One of the serious drawbacks to letting MCMC do the integration for you is the overall output size of the sample of MCMC draws. It's not only slow on disk and in memory just for the I/O; if you save all the random effects and pass them to something like ArviZ, it'll take a long time to compute all the summary statistics. And if you try to be super picky about R-hat thresholds, you might never be happy with the results of 200K multiple comparisons.
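For a rough sense of scale, 200K parameters saved across, say, 4 chains of 1,000 draws each (those chain and draw counts are just assumptions for the arithmetic) is already about 6.4 GB of doubles before any diagnostics get computed. One partial mitigation on the ArviZ side, assuming you keep the latent draws at all, is to restrict the summary to the top-level parameters, e.g. continuing from the hypothetical Poisson-gamma sketch above:

```python
import arviz as az
import pymc as pm

# Sample the reparameterized model sketched earlier (hypothetical `poisson_gamma`).
with poisson_gamma:
    idata = pm.sample(1000, chains=4)

# Summarizing everything would compute R-hat/ESS for every latent lam[i];
# restricting var_names keeps the table to the handful of top-level parameters.
print(az.summary(idata, var_names=["mu", "alpha"]))
```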
One advantage of the random effect approach is that it's a bit more flexible, especially if you have multilevel structure that's not strictly hierarchical in the random effects case, or densities more complicated than normal in the truncated case. So it's usually just simpler to use the random effects. Another advantage is that it doesn't rely on numerical (log) cdfs. In Stan, these are some of our least stable functions when differentiating with respect to parameters, because we only have the traditional cdf implementations and autodiff through taking their log. We should really have custom iterative algorithms for the log cdfs, for both values and gradients, but that's way beyond our person-power and expertise. We couldn't find better implementations, so I'm wondering if you did for PyMC, especially for log cdfs? I should really be asking this in a separate topic.
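In case it helps frame the question, the value-level half of the problem is easy to see with any implementation that computes the cdf on the probability scale and then takes the log (SciPy used here purely as a stand-in; the gradient story is worse still):

```python
import numpy as np
from scipy.stats import norm

x = -40.0
# The cdf itself underflows to 0 in double precision, so its log is -inf ...
print(np.log(norm.cdf(x)))  # -inf
# ... while a dedicated log-cdf stays finite and accurate far into the tail.
print(norm.logcdf(x))       # roughly -804.6
```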