Thank you again @junpenglao for all the feedback!
I checked using trace = pm.sample(), but NUTS is still very slow. The NUTS initialization message does not mention ADVI at all, unlike in some examples I have seen. Is there a way to force the initialization to be carried out by ADVI? I think this problem is related. There, they find that the MAP and ADVI initial values are very different.
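To make the question concrete, here is a minimal sketch of the kind of call I have in mind. It assumes that pm.sample accepts the init and n_init keyword arguments and that "advi+adapt_diag" is a valid init option in the installed version. The helper names advi_init_kwargs and sample_with_advi_init, and the default of 50000 ADVI iterations, are my own placeholders and not part of the library.

```python
def advi_init_kwargs(n_init=50_000):
    """Keyword arguments asking pm.sample to initialize NUTS with ADVI.

    Assumption: "advi+adapt_diag" is an accepted value for `init`, and
    `n_init` sets the number of ADVI iterations.
    """
    return {"init": "advi+adapt_diag", "n_init": n_init}


def sample_with_advi_init(model, n_init=50_000, **sample_kwargs):
    """Hypothetical wrapper: run pm.sample on `model` with ADVI initialization."""
    # Imported lazily so the helper above can be used without PyMC installed.
    try:
        import pymc as pm    # newer releases
    except ImportError:
        import pymc3 as pm   # older releases

    # Caller-supplied options take precedence over the ADVI defaults.
    opts = {**advi_init_kwargs(n_init), **sample_kwargs}
    with model:
        return pm.sample(**opts)
```

With an existing model object, this would be called as trace = sample_with_advi_init(model, draws=1000, tune=1000). If this works as I expect, the initialization message should then mention ADVI.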
About the posterior correlation, I was thinking about what you posted and I think you are completely right.
In summary, when I call trace = pm.sample() alone, NUTS is very slow, probably because ADVI is not called.
Thank you again!
PS:
I don’t follow here. Could you please point me to some reading material that explains how and where the NUTS mass matrix is stored?