You are right - the tuning of NUTS has also changed recently, so it might give different behaviour compared to 3.0. If you are sure about it, you can set tune=0 with init='advi'; then you will be using the mass matrix estimated from ADVI for NUTS (warning: ADVI is known to underestimate the variance of the parameters, and in many cases gives wrong estimates of the parameters altogether).
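Something like the following (untested sketch on a toy model, the data and priors are just placeholders) is what I have in mind:

```python
import numpy as np
import pymc3 as pm

# Toy data, just to have something to fit.
y_obs = np.random.normal(loc=1.0, scale=2.0, size=100)

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sd=10.0)
    sigma = pm.HalfNormal("sigma", sd=5.0)
    pm.Normal("y", mu=mu, sd=sigma, observed=y_obs)

    # init='advi' initialises NUTS with the (diagonal) mass matrix
    # estimated by ADVI; tune=0 skips NUTS' own adaptation, so the
    # ADVI estimate is what you actually sample with.
    trace = pm.sample(1000, tune=0, init="advi", n_init=50000)
```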
As for the prior choice, I agree with you that in many applications (e.g., psychology or behavioural science) where your design matrix encodes the experimental manipulation and you don’t want to do feature selection, a weakly informative prior is usually the way to go. In this case, you are essentially saying “all the parameters are of interest and I want to put as much weight on the data as possible”.
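For a regression-type model that could look something like this (a rough sketch, with a made-up design matrix X and fairly arbitrary prior scales):

```python
import numpy as np
import pymc3 as pm

# Hypothetical design matrix (experimental conditions) and outcome.
X = np.random.normal(size=(200, 3))
y = X @ np.array([0.5, -1.0, 0.2]) + np.random.normal(scale=1.0, size=200)

with pm.Model() as model:
    # Weakly informative priors: wide enough that the data dominate,
    # but still proper (unlike flat priors), which also helps the sampler.
    beta = pm.Normal("beta", mu=0.0, sd=10.0, shape=X.shape[1])
    intercept = pm.Normal("intercept", mu=0.0, sd=10.0)
    sigma = pm.HalfCauchy("sigma", beta=5.0)

    mu = intercept + pm.math.dot(X, beta)
    pm.Normal("y", mu=mu, sd=sigma, observed=y)

    trace = pm.sample(1000, tune=1000)
```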
I am not sure what you mean by “weight the prior distribution”, but if you mean having different prior distributions and weighting them afterwards, you can try model averaging in PyMC3 (fit the different models first, and then do prediction using a weighted combination of all the fitted models).
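A minimal sketch of that idea, assuming two models that differ only in their prior on mu, and with the weights set by hand (you could instead take them from a WAIC/LOO comparison via pm.compare):

```python
import numpy as np
import pymc3 as pm

y_obs = np.random.normal(loc=1.0, scale=2.0, size=100)

# Two models that differ only in the prior on mu.
with pm.Model() as model_narrow:
    mu = pm.Normal("mu", mu=0.0, sd=1.0)
    sigma = pm.HalfNormal("sigma", sd=5.0)
    pm.Normal("y", mu=mu, sd=sigma, observed=y_obs)
    trace_narrow = pm.sample(1000, tune=1000)

with pm.Model() as model_wide:
    mu = pm.Normal("mu", mu=0.0, sd=10.0)
    sigma = pm.HalfNormal("sigma", sd=5.0)
    pm.Normal("y", mu=mu, sd=sigma, observed=y_obs)
    trace_wide = pm.sample(1000, tune=1000)

# Hand-picked weights for illustration; in practice you would usually
# take the weight column from pm.compare.
weights = [0.5, 0.5]

# Weighted posterior predictive across the fitted models
# (called pm.sample_ppc_w in older PyMC3 releases).
ppc = pm.sample_posterior_predictive_w(
    traces=[trace_narrow, trace_wide],
    models=[model_narrow, model_wide],
    weights=weights,
)
```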