I found this question describing similar symptoms, and the fix suggested there, passing init='advi' to pm.sample, worked for me:
>>> import numpy as np
>>> import pymc3 as pm
>>> from theano.printing import Print as tt_print
>>> with pm.Model() as model:
...     mu = pm.Normal('mu', mu=0, sigma=1)
...     # mu_p = tt_print('mu')(mu)
...     obs = pm.Normal('obs', mu=mu, sigma=1, observed=np.random.randn(100))
...     trace = pm.sample(1000, tune=500, init="advi")
...
Auto-assigning NUTS sampler...
Initializing NUTS using advi...
Average Loss = 142.29: 4%|██▉ | 8299/200000 [00:02<01:03, 3018.93it/s]
Convergence achieved at 8300
Interrupted at 8,299 [4%]: Average Loss = 148.34
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [mu]
Sampling 2 chains, 0 divergences: 35%|█████████████████████▌ | 1043/3000 [02:15<05:29, 5.94draws/s]