It is possible. NUTS works perfectly when the data yield a unimodal posterior, but I have one particularly important dataset (real clinical data) that happens to give two modes with approximately the same mass. What happens is that each chain gets poorly initialized, and the sampling stays inefficient even after relatively long tuning. I suppose this is because, in this multimodal situation, ADVI estimates something that is neither the first nor the second mode.
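As a toy illustration of why a chain's starting point matters so much with two well-separated modes, here is a minimal sketch. It is not the clinical model and it does not use NUTS or ADVI: it assumes a 1D equal-weight mixture of two unit-variance Gaussians at -8 and +8, sampled with a plain random-walk Metropolis step. The names `log_target` and `metropolis`, the mode locations, and the step size are all made up for the example.

```python
import math
import random


def log_target(x, mode=8.0):
    """Log density (up to a constant) of an equal-weight mixture
    of N(-mode, 1) and N(+mode, 1). Hypothetical toy target."""
    a = -0.5 * (x + mode) ** 2
    b = -0.5 * (x - mode) ** 2
    m = max(a, b)
    # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(start, n_steps=2000, step=0.5, seed=0):
    """Random-walk Metropolis chain: a stand-in for a local sampler, not NUTS."""
    rng = random.Random(seed)
    x = start
    logp = log_target(x)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        samples.append(x)
    return samples


# Two chains, each initialized in a different mode.
left = metropolis(start=-8.0, seed=1)
right = metropolis(start=8.0, seed=2)

frac_left = sum(s < 0 for s in left) / len(left)
frac_right = sum(s < 0 for s in right) / len(right)
print(f"chain started at -8: fraction of draws below 0 = {frac_left:.2f}")
print(f"chain started at +8: fraction of draws below 0 = {frac_right:.2f}")
```

Both modes carry the same mass, so a well-mixing chain would spend about half its draws on each side of zero. In this toy setup each chain instead stays in the mode it started in, so whatever the initialization picks largely decides what each chain reports.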
Just to clarify: this is for an academic article. It is not a major issue, since I already have the answer from emcee, but in the article that I’m writing it would be sort of silly to say that I used one sampler for this and another for that, so I want to find something robust and fast that can be cited. This is the proverbial 90% of the work that is going to bring about 10% of the results.
I can show you my model and you will see why NUTS is struggling (I’m really abusing it), but you would have to confirm that you have the spare time and interest, because explaining what is happening in the model will take quite a lot of my time, which is scarce at the moment.
As for the question in the topic, I’m glad that I tried SMC and DEMetropolis, thanks to you (!), and (sort of) understood that they are a dead end for this problem.