You can find the PyMC3 notebook discussing the diagnosis of divergences here. It’s slightly out of date: many of the functions in that notebook have been (or are in the midst of being) moved into ArviZ. But the notebook should nonetheless give you good guidance about the general procedure.
Indeed. Ultimately, divergences are simply an indication of what was going on during sampling. They tell you that the sampler was having trouble, and figuring out where in the parameter space sampling was difficult (i.e., where the divergences were observed) is a major part of the investigation. The notebook above specifically addresses this and presents a re-specified model that alleviates the sampling difficulties it uncovers. Of course, you can get a stray divergence here or there for no obvious reason. But until you go digging, you’ll never know why.
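To make the "where did the divergences happen" step concrete, here is a minimal sketch in plain Python. It assumes you have already pulled out two equal-length sequences: the draws for one parameter and the per-draw divergence flags (in PyMC3 these are stored as the `diverging` sampler statistic). The helper name `summarize_divergences`, the parameter name `log_tau`, and the numbers are all made up for illustration; this is not how the notebook does it, just one rough way to check whether divergent draws cluster in a particular region.

```python
from statistics import mean


def summarize_divergences(values, diverging):
    """Compare a parameter's draws at divergent vs. non-divergent iterations.

    Hypothetical helper for illustration only; not part of PyMC3 or ArviZ.
    """
    if len(values) != len(diverging):
        raise ValueError("values and diverging must have the same length")
    divergent = [v for v, d in zip(values, diverging) if d]
    regular = [v for v, d in zip(values, diverging) if not d]
    return {
        "n_divergent": len(divergent),
        "fraction_divergent": len(divergent) / len(values) if values else 0.0,
        "mean_divergent": mean(divergent) if divergent else None,
        "mean_regular": mean(regular) if regular else None,
    }


# Toy data, invented for this example: draws of a scale parameter on the
# log scale, plus a flag for whether each draw was marked as divergent.
log_tau = [1.2, 0.8, -1.5, 0.3, -2.1, 0.9, -1.8, 1.1]
diverging = [False, False, True, False, True, False, True, False]

summary = summarize_divergences(log_tau, diverging)
print(summary)
```

If the divergent draws sit in a clearly different region than the rest (here, at small values of the toy `log_tau`), that region is a good candidate for where the sampler is struggling, and a pair plot of the same quantities will usually show it more clearly than summary numbers.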