Slightly tangential question, but how often do you think sampling difficulty arises from multimodality, as opposed to difficult or degenerate posterior geometry? In my own work (economics) I see the latter much more often than the former, although I admittedly don’t work much with mixture models.
Multimodality is definitely less of an issue than it used to be, since features like the ordered transform and the mixture classes have made label switching much less common. Variational inference has also been pretty helpful, since it typically just picks a single mode for the approximate posterior, though that’s more or less sweeping the issue under the rug so it doesn’t show up in the convergence diagnostics.
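For anyone wondering why an ordering constraint helps with label switching, here is a minimal sketch of the idea, not any particular library's implementation. It assumes the usual construction where the first element is left free and each later element adds a positive (exponentiated) increment; the function names `ordered_forward` and `ordered_inverse` are made up for illustration.

```python
import math

def ordered_forward(y):
    """Map an unconstrained vector y to a strictly increasing vector x.

    Assumed construction: x[0] = y[0], x[i] = x[i-1] + exp(y[i]).
    """
    x = [y[0]]
    for step in y[1:]:
        x.append(x[-1] + math.exp(step))  # exp(step) > 0, so x keeps increasing
    return x

def ordered_inverse(x):
    """Recover the unconstrained vector from a strictly increasing x."""
    return [x[0]] + [math.log(b - a) for a, b in zip(x, x[1:])]

# Hypothetical unconstrained values a sampler might propose for three component means.
unconstrained = [0.5, -1.0, 2.0]
means = ordered_forward(unconstrained)

# Permuting the unconstrained values still yields increasing means, so the
# "same mixture with labels swapped" configurations are never visited.
permuted_means = ordered_forward([2.0, -1.0, 0.5])
```

Because the sampler works in the unconstrained space and every point there maps to a sorted set of component means, the relabeled copies of each mode are excluded by construction. That only removes the label-switching kind of multimodality; genuinely distinct modes are still there.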