Can Hamiltonian MCMC be used effectively if the likelihood is identically zero for some parameters?

You could try adding a penalty term via a pm.Potential? Something like -1/softplus(c + b*(Theta - T)). When Theta >> T, this term converges to zero, but as Theta approaches or drops below T, the penalty becomes arbitrarily large. c and b are free parameters that let you control the size of the penalty at Theta = T (c) and how quickly it decays to zero away from T (b).
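In the model this would presumably be wired up as something like `pm.Potential("barrier", -1 / pt.softplus(c + b * (theta - T)))`. Here's a quick numpy sketch of the shape of the term; `soft_barrier` and the values of b, c, and T are illustrative, not from the original:

```python
import numpy as np

def softplus(x):
    # numerically stable log(1 + exp(x))
    return np.logaddexp(0.0, x)

def soft_barrier(theta, T=0.0, b=10.0, c=1.0):
    # log-probability term to add via pm.Potential: roughly zero when
    # theta is well above the threshold T, arbitrarily negative as
    # theta falls below T
    return -1.0 / softplus(c + b * (theta - T))

# far above the threshold, the penalty is negligible
print(soft_barrier(5.0))    # ≈ -0.02
# below the threshold, it blows up
print(soft_barrier(-2.0))   # ≈ -1.8e8
```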

The advantage of something like this over a hard cutoff like ell = pt.switch(pt.ge(theta, T), ell, 0) is that it’s differentiable everywhere, so HMC might end up sampling better? I’d try the switch too (assuming you haven’t already).
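To see why the differentiability matters, here is a small numpy sketch (the function names, constants, and finite-difference check are mine, just for illustration) comparing the gradients the sampler would see under each approach:

```python
import numpy as np

def softplus(x):
    return np.logaddexp(0.0, x)

def soft_barrier(theta, T=0.0, b=10.0, c=1.0):
    # smooth penalty that ramps up as theta falls below T
    return -1.0 / softplus(c + b * (theta - T))

def hard_switch(theta, T=0.0):
    # hard cutoff: zero log-prob contribution inside the support,
    # -inf outside (likelihood identically zero)
    return np.where(theta >= T, 0.0, -np.inf)

def grad(f, x, eps=1e-6):
    # central finite-difference gradient
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# the soft barrier gives a finite gradient pushing theta back toward
# the support, which the leapfrog integrator can actually use
print(grad(soft_barrier, -0.5))  # large and positive

# the hard switch is flat inside the support and undefined at the
# boundary, so the sampler gets no restoring force from it
print(grad(hard_switch, 0.5))    # 0.0
```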