Efficient Implementation of Many-to-Many Relationship

Hi all,

I’m having some issues implementing a certain model, and I’d appreciate any input the community can offer. This is somewhat related to a previous question I posed, although this question perhaps gets more to the point.

I have a model (slightly different from the one in the previous question) that I believe is well-suited to the problem at hand. My model involves a many-to-many relationship between two sets of variables. Specifically, I have a set of n parent variables (call them p) and a set of m child variables (call them c), with a many-to-many relationship between them. Each child variable c_{i} is equal to the product of one or more parent variables p_{j}:
50 AM, where pa(c_{i}) denotes the set of parents of c_{i}.

This relationship can also be expressed using an m x n boolean matrix A, where
07 AM
It should be noted that this matrix is quite sparse. There are roughly 5000 parents and roughly 2000 children, with each child having between 1 and 10 parents.

My issue is how to efficiently implement this many-to-many relationship in pymc3/theano

My initial attempt at modeling this problem involved taking the matrix approach, where the i’th child is the product of the parent vector element-wise raised to the power of the i’th row of the matrix A:

parent_indices = {p:i for i,p in enumerate(sorted(parents))} # gives the index of a parent
A = np.zeros((m,n),dtype=np.float32)
for i, pa_ci in enumerate(parent_child_mappings):
    for parent in pa_ci:
        A[i,parent_indices[parent]] = 1 # Mark the parent as being mapped to this child

with pm.Model() as model:
    parents = pm.Uniform("parents", lower=0,upper=1,shape=n,testval=0.5)
    children = pm.math.prod(tt.power(parents,A),axis=1)
    # Rest of the model...

While this approach worked fine on small-scale synthetic data (< 100 parents and children), it is abysmally slow on my actual data. I suspect it has to do with both the sparsity of the parent-child mappings A, as well as the nasty functional form of taking the product of a large vector masked by raising it to the power of the row of a large matrix.

I also implemented this model in Stan, and since Stan allows for loops I took an iterative approach, constructing each child in a loop by indexing into both the parent variable array, and a single long array of the indices of each child’s parents:

data {
    int<lower=0> M; // number of children
    int<lower=0> N; // number of parents
    int<lower=0> K; // Total length of all parent-child lists
    // Indices of each child's parents in array parents,
    // all flattened into one big array
    int<lower=0> parent_child_list[K];  list
    int<lower=0> starts[M]; // Index of beginning of each child's parent list in parent_child_list
    int<lower=0> lengths[M]; // Length of each child's parent list in parent_child_list
parameters {
    real<lower=0, upper=1> parents[N];
transformed parameters {
    real<lower=0, upper=1> children[M];
    for (m in 1:M) {
        // Each child is simply the product of the value of its parents
        real child;
        child = 1.0;
        for (i in 1:lengths[m]) {
            child = child * parents[parent_child_list[starts[m] + i] + 1];
        children[m] = child;
model {
    // Actual model isn't all that relevant here.

This approach in Stan performed similarly to the matrix approach in PyMC3: Worked fine on smallish data, but was prohibitively slow on my actual data. I had hoped that the iterative array-based approach would alleviate some of the sparsity issues that I suspected were slowing the pymc3 implementation down, but the iterative approach in Stan may have incurred some slowdown because of all the looping.

I’m at somewhat of a loss here. Is this just too nasty of a problem to implement in any efficient way? Is there some canonical way of representing these sorts of problems that I’ve somehow missed?

Thanks in advance for any help you can provide, and let me know if anything isn’t clear.

I haven’t looked into this in detail yet, so I might be missing something obvious. But couldn’t you look at this on a log scale and then write this down as a matrix-vector product with a sparse matrix? Theano has some support for those.

Good idea! By using exp(log(x)), the functional form becomes exp(dot(A,log(parents))). By making A sparse, I was able to get the execution time down to less than 10 minutes (much better than the multiple hours it was estimating before!).

Thanks for the tip.