Are you sure you want to do significance testing at a level of 0.05? Half (at least) of the people here will scream in terror reading that part.
There are many ways to do model comparison, and the current recommendations are prediction-based (posterior predictive checks with pm.sample_ppc, cross-validation with pm.loo). You can see more information here and here.
In the end, it kind of comes down to what you are going to do with the selected model. If you are going to use it for prediction, then there are better ways to do so (model averaging). If you want to say that's the model that best explains your data, then you can just describe the one with the least prediction error.
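To make "least prediction error" concrete, here is a minimal sketch of picking between two candidate models by leave-one-out cross-validation. It is only an illustration under my own assumptions: the data are simulated, the two models (a constant mean and a straight line) are fit by ordinary least squares rather than sampled, and the helper names `fit_mean`, `fit_line` and `loo_mse` are made up for this example. It is plain leave-one-out, not the approximate LOO that pm.loo computes from posterior samples, but the idea of ranking models by out-of-sample error is the same.

```python
import random


def fit_mean(xs, ys):
    """Hypothetical model 1: predict the training mean, ignoring x."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Hypothetical model 2: simple least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def loo_mse(fit, xs, ys):
    """Leave-one-out mean squared prediction error for a fitting function."""
    squared_errors = []
    for i in range(len(xs)):
        # Refit on everything except point i, then predict point i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predict = fit(train_x, train_y)
        squared_errors.append((ys[i] - predict(xs[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Simulated data (an assumption for the demo): a linear trend plus noise.
rng = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

scores = {
    "mean_only": loo_mse(fit_mean, xs, ys),
    "linear": loo_mse(fit_line, xs, ys),
}
best = min(scores, key=scores.get)  # model with the least prediction error
print(scores, best)
```

If the goal is prediction rather than picking a single model to describe, the same scores could instead be used to weight the models' predictions, which is the model averaging idea mentioned above.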