You can set init='advi' in pm.sample().
In theory, there should be no data coverage problem: you are sampling from the posterior distribution (conditioned on your model and data). If you have too little data, your prior more or less takes over and there is a lot of uncertainty in your estimates. The real problem arises when the posterior has complex geometry and the sampler cannot explore it effectively in a finite amount of time.
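To make the "prior takes over" point concrete, here is a small sketch using a conjugate Normal model with known noise, where the posterior has a closed form and no sampler is needed. The function name `normal_posterior`, the prior N(0, 1), and the made-up observations are all assumptions for illustration, not anything from your model:

```python
import math

def normal_posterior(data, prior_mean=0.0, prior_sd=1.0, noise_sd=1.0):
    """Posterior mean and sd of a Normal mean with known noise sd."""
    prior_prec = 1.0 / prior_sd**2          # precision contributed by the prior
    data_prec = len(data) / noise_sd**2     # precision contributed by the data
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + sum(data) / noise_sd**2) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical observations, all equal to 3.0 for simplicity.
few_mean, few_sd = normal_posterior([3.0, 3.0])
many_mean, many_sd = normal_posterior([3.0] * 200)

print(f"n=2:   mean={few_mean:.3f}, sd={few_sd:.3f}")
print(f"n=200: mean={many_mean:.3f}, sd={many_sd:.3f}")
```

With only two observations the posterior mean is pulled noticeably toward the prior mean of 0 and the posterior sd stays wide; with 200 observations the data dominate and the sd shrinks. Neither case is a sampling problem in itself.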