To clarify, I’m not sure the sort is a good strategy because NUTS may not love the discontinuities in the gradient (if it ever finds them after tuning). A scaled cumsum of Uniform/Dirichlet may be just fine.
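For concreteness, here's a minimal sketch of that cumsum construction in PyMC (names like `upper` and `n_cut` are placeholders I made up, not from the original model):

```python
import numpy as np
import pymc as pm

upper = 10.0   # assumed upper bound for the ordered values
n_cut = 5      # assumed number of ordered values

with pm.Model() as model:
    # Dirichlet weights are positive and sum to 1, so their cumulative sum is strictly increasing
    w = pm.Dirichlet("w", a=np.ones(n_cut + 1))
    # drop the last weight so the ordered values stay strictly below `upper`
    cutpoints = pm.Deterministic("cutpoints", upper * pm.math.cumsum(w[:-1]))
```

The transform is smooth everywhere, so there's no sort-induced kink for NUTS to trip over.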
In general it’s hard to find a forward constraining transform that’s exactly equivalent to a parameter transform + Jacobian correction. For instance, the sort won’t give the same results if the uniform prior is not IID.
Either way it can be hard to think about these sorts of priors, especially when they mix dimensions. We are assigning them some well-understood prior densities, but the implied parameters can be hard to grok.
Like if you assign a normal density to a log-transformed variable, it will certainly never behave very “normally” :). And similarly, ordered variables will often not look anything like the original independent densities (and they don’t here).
I usually prefer to do the forward transform like @jessegrabowski illustrated (also because it’s much more natural in PyMC) and rely on prior predictive sampling to understand its implications after constraining. I’m too lazy to start from the constrained prior and work backwards. It may also matter little if the likelihood swamps the prior anyway.
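Something along these lines, assuming the `model` sketched above:

```python
with model:
    idata = pm.sample_prior_predictive()

# inspect the implied marginals of the constrained (ordered) parameters
print(idata.prior["cutpoints"].mean(dim=("chain", "draw")))
```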
The forward approach is also how I would go about generating fake data, so the model doesn’t end up looking too absurd to me (there’s a circularity here: if I used MCMC to generate fake data, I might be of the opposite opinion).