Dot product of a slice of a 3-dimensional prior and data - Multi-Output-Linear-Regression

I am trying to perform a linear regression with 10 independent and 3 dependent variables. For every sample, the sum of independent variables equals one and so does the sum of the dependent variables. Data is divided into 5 groups. I am trying to build a no-pooling model with independent inference for each group. While I can build individual models for each group, I thought of following the method of group_indexes in the likelihood function so that I can further build hyper-priors and test a hierarchical model. The coefficients of linear regressions are also bounded to [0, 1] so a Dirichlet prior was the option. A Dirichlet prior also lets me simultaneously model the 3 dependent variables that add up to one (this is feasible as my independent variables also add up to one) My model looks like this.

from scipy import stats
import numpy as np
import pymc3 as pm

n_features = 10
dependent_count = 3
group_count  = 5

x = stats.dirichlet(a=np.ones(n_features)).rvs(1000)
y = stats.dirichlet(a=np.ones(dependent_count)).rvs(1000)

idx = np.random.randint(group_count, size=1000) # sample group numbers

with pm.Model() as md:
    # prior
    α = pm.Dirichlet('α', a=[1]*dependent_count,
                     shape=(5, 10, 3))
# taking a dot product of α for each group with the data
    t = []
    for i in range(x.shape[0]):
        t.append([i], α[idx[i]]))

    t_ = tt.stacklists(t)

    ϵ = pm.HalfCauchy('ϵ', 2, shape=(5,3))
    y_pred = pm.Normal('y_pred', mu=t_ ,sd=ϵ[idx], observed=y)
    trace_h = pm.sample(1000, chains=2)

Now this takes forever to run and is stuck at “Initializing NUTS using jitter+adapt_diag”. I’m assuming the way in which I’m creating t_ is wrong. When passing a numpy array, I got an error : ValueError: setting an array element with a sequence. Please keep in mind that the dot product is of the coefficients and independent variables for each group hence the for loop.

You can replace the loop by doing the dot-product “by hand”, multiplying the coefs with the data then summing away the target dimension:

    t_ = (x[..., None] * α[idx]).sum(axis=1)

I expanded x to be (1000, 10, 1) so that it would broadcast with α[idx] (shape (1000, 10, 3)). You can confirm the approaches are equivalent by checking some of the rows of the looping method.

1 Like

Thanks for the quick help! I’m assuming that I would have to follow a similar approach when I’m building a hierarchical version of the model and having multiple group columns.

You’ll have to reason about exactly what you want the math to be doing, but this “align, multiply, sum” recipe lets you do dot-products between conformable dimensions of arbitrary nd-arrays, yes. You’re basically doing np.einsum.

1 Like