I have a small data set, consisting of 8 groups, with c. 73k observations, so around 9k obs per group, and multiple predictors. Previously (non-hierarchical) I used syntax like:
μ = pm.math.dot(α_shr, α) + …
For simplicity, assume independent Normals, e.g.
α = pm.Normal('α', mu=0.0, sigma=10.0, shape=len(hd_vars))
Now that I've specified groups (dims), the shapes are no longer compatible with 'dot' as it stands, as the shape is now: len(hd_vars) * 8
Please could you help me understand the correct way to generalise the dot-product type op to any number of predictors & hierarchy groups for calculating α?
Welcome, @DaveD ! Good idea to follow examples: they are useful, but as you encountered, of limited scope.
The “shape” of variables is a typical complication in PyMC, but once you get the hang of it, it is what gives you creative options for model creation.
I tried to recover my own examples for it, and one is here (but note that this was an older version of PyMC).
Some pointers:
I tend to put the “dimensions”/“coordinates” (what you named “hd_vars”?) into the second dimension. For example, try α = pm.Normal('α', mu=0.0, sigma=10.0, shape=(1, len(hd_vars)))
Then, I tend to avoid the * operator, since it is so overloaded: try aesara.tensor.dot(data[idx], slope)
so the outcome for each summand of your estimator will be shape: (data.shape[0], len(hd_vars)).
Thirdly, on some model components (intercepts, residual) you will have to use a column vector of ones to make them match that shape:
fid = pymc.Data('fake_intercept_data', numpy.ones((data.shape[0], 1)), mutable=True)
intercept = aesara.tensor.dot(fid, pymc.Normal('intercept', ..., shape=(1, len(vars))))
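To make the shape bookkeeping in the pointers above concrete, here is a minimal NumPy sketch of the ones-column trick. It only illustrates the shapes; the sizes, the values, and the names `ones_col`, `x`, `intercept` and `slope` are made up for the example and are not taken from the original model.

```python
import numpy as np

n_obs, n_out = 5, 3  # hypothetical sizes, chosen only for illustration

# Column vector of ones, playing the role of 'fake_intercept_data'.
ones_col = np.ones((n_obs, 1))

# Parameters stored as row vectors, i.e. shape (1, n_out).
intercept = np.array([[0.5, -1.0, 2.0]])
slope = np.array([[2.0, 0.0, -1.0]])

# A single predictor stored as a column, shape (n_obs, 1).
x = np.arange(n_obs, dtype=float).reshape(n_obs, 1)

# (n_obs, 1) dot (1, n_out) -> (n_obs, n_out) for both summands,
# so they can be added without any implicit broadcasting.
intercept_term = ones_col @ intercept
slope_term = x @ slope
estimate = intercept_term + slope_term

print(intercept_term.shape, slope_term.shape, estimate.shape)
```

Every row of `intercept_term` is just a copy of the intercept row, which is all the column of ones is there to achieve.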
Hi @falk, many thanks for your examples, really interesting, especially the model component function!
I notice you've used the ‘idx’ suffix on your data rather than the parameters - I'm assuming ‘idx’ here is a group identifier?
In my non-hierarchical version (where I would have 8 unpooled models) I use:
μ = pm.math.dot(α_shr, α)
where α_shr is of shape (9000, 10), i.e. 9k obs in the first group (for example) and 10 predictors, and α is of shape (10,)
However, in the hierarchical version, in my head at least, the 8 groups of observations are now stacked, so α_shr is of shape (73411, 10) and α is of shape (8, 10), which is obviously incompatible as it stands. I've tried transposing α, but the result is then of shape (73411, 8), which isn't correct either.
I was hoping to use a ‘group index’ in the dot product in order to avoid iterating through each of the 8 sub-groups, but I can't seem to get the syntax right.
I'm hoping there is an easy solution to this! Thank you in advance.
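One possible way to express the ‘group index’ idea, sketched here in plain NumPy so the shapes are easy to check: index the (groups, predictors) coefficient matrix with an integer group id per observation, multiply elementwise, and sum over the predictor axis. The array `group_idx` and the small sizes are assumptions made for the illustration (stand-ins for 73411 observations, 10 predictors and 8 groups); `X` plays the role of α_shr.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-ins for (73411 obs, 10 predictors, 8 groups).
n_obs, n_pred, n_groups = 12, 3, 4

X = rng.normal(size=(n_obs, n_pred))               # role of α_shr
alpha = rng.normal(size=(n_groups, n_pred))        # one coefficient row per group
group_idx = rng.integers(0, n_groups, size=n_obs)  # assumed group id of each row

# alpha[group_idx] has shape (n_obs, n_pred): each observation gets the
# coefficient row of its own group. The row-wise dot product is then an
# elementwise product summed over the predictor axis.
mu = (X * alpha[group_idx]).sum(axis=-1)           # shape (n_obs,)

# Reference computation with an explicit loop over observations.
mu_loop = np.array([X[i] @ alpha[group_idx[i]] for i in range(n_obs)])

# Alternative: take the full (n_obs, n_groups) product from the transposed
# dot, then keep only the column belonging to each observation's group.
mu_alt = (X @ alpha.T)[np.arange(n_obs), group_idx]

print(mu.shape, np.allclose(mu, mu_loop), np.allclose(mu, mu_alt))
```

Inside a PyMC model the same pattern should carry over to tensor variables, along the lines of μ = (α_shr * α[group_idx]).sum(axis=-1) with α declared with shape (8, len(hd_vars)), though that is untested against the model in this thread. The alternative shows why the transposed dot gave (73411, 8): it computes the prediction under every group's coefficients, and only one column per row is wanted.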