I’m having trouble defining an unpooled Dirichlet process model (stick breaking). Everything I’ve tried gives an error complaining that there are inconsistent dimensions. Below is some example code I pulled from Bayesian Analysis with Python (3rd Ed.), but I’ve added a coord
dictionary for weekdays. How can I add dims
arguments to the model definition so that I can get an unpooled model for each day in the dataset? I suspect the shape
arguments aren’t playing nice when I add dims
arguments. Perhaps I need to omit dims
arguments and make the shape
arguments multidimensional?
K = 10
def stick_breaking(α, K):
β = pm.Beta('β', 1., α, shape=K)
w = β * pt.concatenate([[1.], pt.extra_ops.cumprod(1. - β)[:-1]]) + 1E-6
return w/w.sum()
idx = pd.Categorical(tips["day"], categories=["Thurs", "Fri", "Sat", "Sun"]).codes
coords = {"days": categories, "days_flat": categories[idx]}
with pm.Model(coords=coords) as model_DP:
α = pm.Gamma('α', 2, 1)
w = pm.Deterministic('w', stick_breaking(α, K))
means = pm.Normal('means',
mu=np.linspace(cs_exp.min(), cs_exp.max(), K),
sigma=5, shape=K,
transform=pm.distributions.transforms.univariate_ordered,
)
sd = pm.HalfNormal('sd', sigma=5, shape=K)
obs = pm.NormalMixture('obs', w, means, sigma=sd, observed=cs_exp.values)
idata = pm.sample(random_seed=123, target_accept=0.9)