Update: Looking at this post, I got it to run on Windows WSL (Ubuntu) with:
with pm.Model() as model:
    gamma = pm.Gamma("gamma", alpha=1., beta=1., shape=K)
    theta = pm.Dirichlet("theta", a=gamma, shape=K, observed=x)
    trace = pm.sample(draws=1000, cores=2, init="adapt_diag", tune=1000)
Everything seems to work fine for the first 2000 iterations (the first half), but in the second half I get only divergences, and the output reads: "The chain contains only diverging samples. The model is probably misspecified."