There are quite a few posts about the divergence warning; I would suggest starting with the doc Diagnosing Biased Inference with Divergences. In short, the NUTS sampler in PyMC3 (and in Stan and many other modern Bayesian samplers) takes advantage of the geometry of the posterior distribution so that it can sample from it much more efficiently. A divergence warning means the sampler ran into a region where that geometry is too difficult to follow accurately (typically very high curvature), so the resulting estimates may be biased. @colcarroll’s talk on this is an excellent introduction as well.
The main way to get rid of the divergence warning is to reparameterize your model; for example, see the case study by @twiecki. Much of the advice in section 26 of the Stan manual also applies to PyMC3, especially when the NUTS sampler is used. There are lots of reparameterization tips in there (also see the Stan website on this).
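To make "reparameterize" a bit more concrete, here is a minimal sketch of the most common trick, the non-centered parameterization. It is plain NumPy rather than PyMC3, and the function names and the values of `mu` and `tau` are made up for illustration; it only shows that drawing `theta` directly from Normal(mu, tau) and drawing a standard normal `z` and then computing `mu + tau * z` describe the same distribution.

```python
import numpy as np


def centered_draws(rng, mu, tau, n):
    # Centered form: draw theta directly from Normal(mu, tau).
    return rng.normal(loc=mu, scale=tau, size=n)


def noncentered_draws(rng, mu, tau, n):
    # Non-centered form: draw a standard normal z, then shift and scale it.
    z = rng.normal(loc=0.0, scale=1.0, size=n)
    return mu + tau * z


# Hypothetical values, chosen only for the demonstration.
rng = np.random.default_rng(0)
mu, tau, n = 1.5, 0.3, 200_000

theta_c = centered_draws(rng, mu, tau, n)
theta_nc = noncentered_draws(rng, mu, tau, n)

# Both sets of draws should have roughly the same mean and standard deviation.
print(theta_c.mean(), theta_nc.mean())
print(theta_c.std(), theta_nc.std())
```

In a PyMC3 model the same idea would usually be written as a standard normal `pm.Normal` for `z` plus a `pm.Deterministic` for `theta`. The sampler then explores `z`, whose scale no longer depends on `tau`, which is often what removes the divergences in hierarchical models. Whether it helps in your case depends on the model and the data.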