This thread has become quite messy with my many confusions, so that it’s probably difficult to find useful information in it, but I add to my journey into the Bayesian abyss anyway in the hopes it may help somebody else.
In the meantime I found a comment on a blog post about the indexing syntax to which @twiecki replied and explained what the indexing syntax does. Say, you got something like:
with pm.Model() as hierarchical_model:
...
# Intercept for each county, distributed around group mean mu_a
a = pm.Normal('alpha', mu=mu_a, sigma=sigma_a, shape=len(data.county.unique()))
# Expected value
radon_est = a[county_idx]
....
You have
a– a vector with one distribution per county. Your data however, has multiple values per county. So what you want to do is, create a version ofathat has the same length as your data and associates each data point with the right county-distribution.
For example, say we have 2 counties each with 3 data points. Then you havea = [county_1, county_2], to then associate that to your data you could do:a[0, 0, 0, 1, 1, 1]to get the right vector of length 6 that syncs up correctly to your data.