I would like to model a linear regression problem, in which the input random variables’ relations are encoded in a tree structure.
More specifically, each tree node corresponds to an input random variable. For each non-root node n, denote the corresponding input variable as X_n. Further, denote n's parent node as p(n) and the corresponding variable as X_{p(n)}. Then we have the following relations:
- Each input random variable follows a normal distribution, whose mean follows another normal distribution defined by its parent.
- The random variable corresponding to tree root follows is normally distributed with a fixed mean.
In the case of a binary tree of 3 levels (with 7 variables), the model looks like this:
Finally, for simplicity, all normal distributions in this example have the same \sigma.
Below is a MWE, in which a binary tree is used.
- You can specify the number of levels by setting
n_levels
. - The tree structure is encoded in a “parent array”, where the i-th element contains the index of parent of the i-th node.
import numpy as np
import pymc as pm
def build_parent_array(n_levels: int) -> np.ndarray:
"""
utility function to build the parent array for a binary tree of n levels
examples
>> build_parent_array(1) # [-1]
>> build_parent_array(2) # [-1, 0, 0]
>> build_parent_array(3) # [-1, 0, 0, 1, 1, 2, 2]
"""
assert n_levels >= 1
parent_mapping = [-1]
cur = 0
for l in range(1, n_levels):
for par in range(2 ** (l - 1)):
parent_mapping += [cur, cur]
cur += 1
return np.array(parent_mapping, dtype=int)
n_levels = 7 # you may decrease this value to make the code run
parent_array = build_parent_array(n_levels)
print(f"number of variables: {len(parent_array)}")
SIGMA = 10
with pm.Model() as model:
input_vars = []
for var_id, parent in enumerate(parent_array):
if parent == -1:
input_vars.append(
pm.Normal("var-" + str(var_id), mu=0, sigma=SIGMA)
)
else:
input_vars.append(
pm.Normal(
"var-" + str(var_id), mu=input_vars[parent], sigma=SIGMA
)
)
inputs = pm.Deterministic("inputs", pm.math.stack(input_vars))
pm.sample()
However, I observe that the code does not “scale” with the tree size.
For instance setting n_levels
beyond 6 would give the following error.
fatal error: bracket nesting level exceeded maximum of 256
My questions are:
- If we restrict to the above modeling, how can I circumvent the maximum nesting level limit?
- Is there any modeling approach to achieve a similar effect, i.e., a variable prior depends on its parent, and does not suffer from the above issue?