Use of coordinates/dimensions in hierarchical models

Im looking for general guidance on how dimensions interact with hierarchical models.

Ive been using A Primer on Bayesian Methods for Multilevel Modeling — PyMC example gallery as a guide to turn what is essentially regression into a hierarchical version, together with coordinates/dimensions and Im struggling with the shapes of operations

I have a small data set, consisting of 8 groups, with c. 73k observations, so around 9k obs per group, and multiple predictors. Previously (non-hierarchical) I used syntax like:

μ = pm.math.dot(α_shr, α) + …

for simplicity assume independent Normals, eg

α = pm.Normal(‘α’, mu = 0.0, sigma = 10.0, shape=len(hd_vars)

Now that Ive specified groups (dims), the shapes are no longer compatible with ‘dot’ as it stands, as the shape is now: len(hd_vars) * 8

The above reference uses syntax like: theta = a_county[county_idx] + b_county[county_idx] * floor_idx

Please could you help me understand the correct way to generalise the dot-product type op to any number of predictors & hierarchy groups for calculating α?

Welcome, @DaveD ! Good idea to follow examples: they are useful, but as you encountered, of limited scope.

The “shape” of variables is a typical complication in PyMC, but once you get a hang of it, there are what gives the creative options for model creation.
I tried to recover my own examples for it, and one is here (but note that this was an older version of PyMC).

Some pointers:
I tend to use the “dimensions”/“coordinates” (what you named “hd_vars”?) into the second dimension. For example, try
α = pm.Normal(‘α’, mu = 0.0, sigma = 10.0, shape=(1, len(hd_vars)))

Then, I tend to avoid the * operator, since it is so overloaded: try
aesara.tensor.dot(data[idx], slope)
so the outcome for each summand of your estimator will be shape: (data.shape[0], len(hd_vars)).

Thirdly, on some model components (intercepts, residual) you will have to use a column vector of ones to make them match that shape.
fid = pymc.Data('fake_intercept_data', numpy.ones((data.shape[0],1)), mutable = True)
intercept = aesara.tensor.dot(fid, pymc.Normal('intercept', ..., shape = (1, len(vars))))

Hope this helps!
Cheers

Hi @falk, many thanks for your examples, really interesting, especially the model component function!

I notice youve used the ‘idx’ suffix on your data rather than the parameters - Im assuming ‘idx’ here is a group identifier?

In my non-hierarchical version (where I would have 8 unpooled models) I use:
μ = pm.math.dot(α_shr, α)
where α_shr is of shape (9000, 10), for 9k obs in the first group (for e.g.) and 10 predictors, and α is of shape (10, )

However, in the hierarchical version, in my head at least, the 8 groups of observations are now stacked, so α_shr is of shape (73411, 10) and α is of shape (8, 10), which is obviously incompatible as it stands. Ive tried transposing α but the result is then (73411, 8), which isnt correct either.

I was hoping to use a ‘group index’ in the dot product in order to avoid iterating through each of the 8 sub-groups, but I cant seem to get the syntax right.

Im hoping there is an easy solution to this! Thank you in advance :smiley: