Would the permutation bijector avoid the mode switching problem?
I think mode switching within a chain would affect the estimate of the distribution.
But for now I only want to find the best parameters for the model and don’t need an estimate of the variance, so I guess the post-processing would work for me.
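
To make the post-processing idea concrete, here is a minimal NumPy sketch of one common relabeling heuristic: reorder the components in every draw so that the means are ascending, then average. It assumes the draws are stored as arrays of shape (num_draws, num_components). The function name `relabel_by_sorted_means` and the toy numbers are hypothetical, and sorting by means is only one possible relabeling rule, not necessarily equivalent to what the permutation bijector does.

```python
import numpy as np


def relabel_by_sorted_means(means, weights):
    """Reorder mixture components in each draw so the means are ascending.

    means, weights: arrays of shape (num_draws, num_components).
    Hypothetical helper for illustration only.
    """
    # Per-draw permutation that sorts the component means.
    order = np.argsort(means, axis=1)
    # Apply the same permutation to every per-component parameter.
    sorted_means = np.take_along_axis(means, order, axis=1)
    sorted_weights = np.take_along_axis(weights, order, axis=1)
    return sorted_means, sorted_weights


# Toy draws from a two-component mixture; the labels swap in the second draw.
means = np.array([[-2.0, 3.0], [3.1, -1.9], [-2.1, 2.9]])
weights = np.array([[0.30, 0.70], [0.68, 0.32], [0.31, 0.69]])

sorted_means, sorted_weights = relabel_by_sorted_means(means, weights)

# After relabeling, averaging over draws gives a sensible point estimate.
point_estimate = sorted_means.mean(axis=0)
```

Without the relabeling step, averaging the raw `means` over draws would mix the two components together, which is why the swap matters even when only a point estimate is needed.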