Building a proper hierarchical Dirichlet process

Hey everybody,
I recently did some research on hierarchical models in pymc and how to implement them. What struck me is that I haven’t found a single model with several levels that seems to have produced valid results. Many blogs and papers describe approaches for modelling normally distributed data, but every single one showed serious warnings during sampling: rhat greater than 1.4, fewer than 200 effective samples, a large number of divergences, etc. (here is one example: GLM: Hierarchical Linear Regression — PyMC3 3.10.0 documentation). The authors simply ignored these warnings, but I think they are a strong indicator that the results might be biased. So I set up a model myself using the conjugate prior of the normal distribution, the normal-inverse-gamma distribution, or the LKJ (depending on the data), with both mean and variance variable, and ran some samplings. In the end I wasn’t able to build a model that outperformed the existing ones. This raised some questions I haven’t been able to answer so far.
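(For reference, here is a minimal sketch of how those warning signs can be checked explicitly with ArviZ; `idata` is assumed to be the InferenceData returned by `pm.sample`, and the thresholds are simply the ones quoted above.)

```python
import arviz as az

# idata: InferenceData returned by pm.sample(return_inferencedata=True)
summary = az.summary(idata)

# Flag the warning signs quoted above: high rhat, low effective sample size
suspect = summary[(summary["r_hat"] > 1.4) | (summary["ess_bulk"] < 200)]
print(suspect)

# Count the divergent transitions recorded by NUTS
print(int(idata.sample_stats["diverging"].sum()), "divergences")
```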
So I decided to ask whether anyone has experience with this topic and with the performance of NUTS sampling in general. I have read about some troubles pymc has with hierarchical models: the maximum number of levels for stable sampling seems to be three, and multi-modal posteriors appear to cause sampling problems. Another problem is centred distributions (there are interesting papers on how to rewrite a centred normal distribution in non-centred form); see the sketch below.
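To make that last point concrete, the reparameterization those papers describe looks roughly like this (a toy sketch with made-up shapes and priors; the same calls work in pymc3 and later PyMC versions):

```python
import pymc3 as pm

# Centred: group effects drawn directly around mu, which couples
# theta to sigma and produces the classic "funnel" geometry
with pm.Model() as centred:
    mu = pm.Normal("mu", mu=0.0, sigma=5.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    theta = pm.Normal("theta", mu=mu, sigma=sigma, shape=8)

# Non-centred: sample standardized offsets and rescale them, which
# decorrelates theta from sigma and typically removes the divergences
with pm.Model() as non_centred:
    mu = pm.Normal("mu", mu=0.0, sigma=5.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    theta_raw = pm.Normal("theta_raw", mu=0.0, sigma=1.0, shape=8)
    theta = pm.Deterministic("theta", mu + sigma * theta_raw)
```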
So what do you think: does pymc have problems building stable hierarchical models? Or does it only have problems with conjugate-prior models? Or are the existing approaches simply badly parametrized?
My approach for a hierarchical model with a conjugate prior would be something like this:

x_i \mid \mu, \tau \sim N(\mu, \tau) \quad \text{i.i.d.}
\mu \mid \tau \sim N(\mu_0, n_0\tau)
\tau \sim \mathrm{Ga}(\alpha, \beta)

Here N(\cdot,\cdot) is parameterized by mean and precision, so n_0\tau is the precision of \mu (the normal-gamma conjugate prior).

With the hyperparameters \mu_0, n_0, \alpha and \beta being drawn a level below.
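As a purely illustrative pymc3 sketch of that single layer (with made-up values \mu_0 = 0, n_0 = 4, \alpha = 2, \beta = 1 standing in for the extra level, and toy data):

```python
import numpy as np
import pymc3 as pm

# Toy data, just for illustration
x = np.random.default_rng(0).normal(1.0, 2.0, size=50)

with pm.Model() as normal_gamma:
    # tau ~ Ga(alpha, beta): prior on the precision
    tau = pm.Gamma("tau", alpha=2.0, beta=1.0)
    # mu | tau ~ N(mu_0, n_0 * tau): the precision of mu scales with tau,
    # which is exactly the coupling the conjugate prior requires
    mu = pm.Normal("mu", mu=0.0, tau=4.0 * tau)
    # x_i | mu, tau ~ N(mu, tau) i.i.d.
    pm.Normal("obs", mu=mu, tau=tau, observed=x)
    idata = pm.sample(return_inferencedata=True)
```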

Looking forward to answers!
Leon