Hi! The short answer is that this posterior may simply be difficult to sample.
Some posteriors frustrate gradient-based samplers. On each iteration, NUTS keeps extending its trajectory while checking for a U-turn, and it stops expanding as soon as it encounters one or after 1,024 steps, whichever comes first. If sampling is taking this long, it may be taking all 1,024 steps on every iteration. This means
- the sampler is not actually encountering a U-turn, meaning the draws will be more correlated than they could be, and
- it takes a long time (1,024 log_prob evaluations, and 1,024 grad(log_prob) evaluations, more or less).
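To make the cost concrete, here is a minimal sketch in plain Python. It assumes your sampler reports a per-draw tree depth (I believe PyMC exposes this as the `tree_depth` sampler stat, but check your version) and that the maximum depth is 10, which is where the 1,024 comes from. The helper names and the example depths are made up for illustration.

```python
def leapfrog_steps(tree_depth):
    """Approximate number of leapfrog steps for a NUTS tree of this depth."""
    return 2 ** tree_depth


def fraction_at_max_depth(tree_depths, max_treedepth=10):
    """Share of draws whose tree grew all the way to max_treedepth."""
    depths = list(tree_depths)
    if not depths:
        raise ValueError("need at least one draw")
    return sum(d >= max_treedepth for d in depths) / len(depths)


# Hypothetical per-draw depths, invented for this example.
example_depths = [10, 10, 9, 10, 10, 10, 10, 10]

print(leapfrog_steps(10))                     # 1024
print(fraction_at_max_depth(example_depths))  # 0.875
```

If that fraction is close to 1, nearly every draw is paying for the full 1,024 gradient evaluations.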
Sampling typically gets faster as tuning goes along (I think PyMC currently switches to a new mass matrix after 101 tuning draws), but if that takes prohibitively long, you might have to think about
- an alternative strategy to summarize the posterior (optimization, VI, pen and paper),
- changing some priors to better-behaved distributions that are reasonably informative (e.g., make everything normal or half-normal with scales like 10 instead of 1e10), or
- changing the model structure to better capture how the data were generated.
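On the priors point, a rough way to see the difference between a scale of 10 and a scale of 1e10 is to ask how much prior mass a zero-centered normal puts on a plausible range. This is only a sketch: the bound of 30 is an arbitrary stand-in for "values I would actually believe", and `prob_within` is a helper I made up for the illustration.

```python
import math


def prob_within(bound, scale):
    """P(|x| < bound) for x ~ Normal(0, scale)."""
    return math.erf(bound / (scale * math.sqrt(2.0)))


# Assumed plausible range for the parameter: |x| < 30 (arbitrary choice).
print(prob_within(30, 10))    # almost all of the prior mass
print(prob_within(30, 1e10))  # essentially none of it
```

With the huge scale, the prior is effectively flat over any region you care about, so it does nothing to keep the sampler away from extreme values.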
Sorry this isn’t an easy answer!