Discrete random variables and discontinuous functions such as switch are usually problematic. I suggest reparameterizing them into continuous random variables and functions; see for example:
How can I marginalized a model with latent discrete variables?
I am surprised that the discrete switch point works well here. Still, I suggest using a continuous step function instead of switch; for more details see: https://stackoverflow.com/questions/49144144/convert-numpy-function-to-theano/49152694#49152694
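To make the continuous step idea concrete, here is a minimal sketch in plain Python. The names `hard_switch`, `smooth_switch` and `sharpness` are made up for illustration and are not part of PyMC. Inside a model you would write the same formula with tensor operations (for example `pm.math.sigmoid`, assuming PyMC3), so treat this as an illustration of the idea rather than a drop-in replacement.

```python
import math


def logistic(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hard_switch(t, tau, early, late):
    """Discontinuous version: jumps from `early` to `late` at t == tau."""
    return early if t < tau else late


def smooth_switch(t, tau, early, late, sharpness=10.0):
    """Continuous version: blends `early` into `late` around tau.

    `sharpness` (hypothetical name) controls how steep the transition is;
    larger values bring the curve closer to the hard switch.
    """
    weight = logistic(sharpness * (t - tau))
    return early + weight * (late - early)


# Compare the two around an assumed switch point tau = 5.
for t in (0.0, 4.9, 5.0, 5.1, 10.0):
    print(t, hard_switch(t, 5.0, 1.0, 3.0), smooth_switch(t, 5.0, 1.0, 3.0))
```

Because the smooth version is differentiable in tau, the switch point can be given a continuous prior and sampled with a gradient-based sampler instead of a discrete one.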
Back to your original question: in pm.sample, each sampler has a target that it tries to reach during tuning. For example, Metropolis adjusts the scale of its transition kernel so that the acceptance probability is around 50%.
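As a rough illustration of what tuning toward an acceptance target means, here is a toy random-walk Metropolis sampler on a standard normal. This is not the adaptation rule PyMC actually uses. The function `tune_scale`, the window size and the 1.1 / 0.9 factors are all assumptions chosen for the sketch.

```python
import math
import random


def log_density(x):
    """Unnormalized log density of a standard normal (toy target)."""
    return -0.5 * x * x


def tune_scale(scale, target=0.5, n_windows=200, window=50, seed=0):
    """Adapt the proposal scale of a random-walk Metropolis sampler.

    After each window of `window` steps, widen the proposal if the
    observed acceptance rate is above `target`, otherwise narrow it.
    """
    rng = random.Random(seed)
    x = 0.0
    for _ in range(n_windows):
        accepted = 0
        for _ in range(window):
            proposal = x + scale * rng.gauss(0.0, 1.0)
            log_ratio = log_density(proposal) - log_density(x)
            # Metropolis rule: accept with probability min(1, ratio).
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = proposal
                accepted += 1
        rate = accepted / window
        # Too many acceptances: steps are too timid, so widen the proposal.
        # Too few: steps are too bold, so narrow it.
        scale *= 1.1 if rate > target else 0.9
    return scale


# Start from a deliberately small scale and let tuning grow it.
print(tune_scale(0.1))
```

The point is only that the scale you pass in is a starting value: during the tuning phase the sampler keeps changing it until the acceptance rate is near its target, and the tuning draws are normally discarded.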
In NUTS,…