# `stack`ing and `sum`ing a list of random variables ruins sampling?

Hello all,

I feel like this may be a problem of simple misunderstanding… I have a model in which I sum a list of slope terms (`slopes` in the code below). If I have two slope terms and compute the predicted mean like

```
mu = pm.Deterministic("mu", a[o] + slopes[0] + slopes[1])
```

then everything is fine.

However, if I try to cover the general case of having an arbitrary number of slope terms, like this

```
mu = pm.Deterministic("mu", a[o] + pm.math.sum(pm.math.stack(slopes)))
```

then the model takes forever to sample and produces enormous variation in the output parameters. (It does sample, though, so I believe the tensor shapes are fine…)

What am I missing about the behaviour of the `stack` and `sum`?

Cheers and thanks,
Eberhard

Model

```
# shortcuts (there are three categories, indexed as 0, 1, 2)
o = df["category_idx"].values

with pm.Model() as m:
    # Global hyperprior for mean
    bar_a = pm.Normal("bar_a", mu=0, sigma=params.sigma_bar_a)

    # Variation around mean per category, centered prior
    z_a = pm.Normal("z_a", mu=0, sigma=1, shape=3)
    a = pm.Deterministic("a", bar_a + z_a * params.sigma_a)

    # containers for slope terms
    bar_bs = {}
    z_bs_o = {}
    bs_o = {}
    slopes = []

    # slope terms
    for var in params.vars:
        # Global hyperprior for slope
        bar_bs[var.abbr] = pm.Normal(
            f"bar_b_{var.abbr}", mu=0, sigma=var.sigma_bar_b
        )

        # Slope per category; centered prior
        z_bs_o[var.abbr] = pm.Normal(f"z_b_{var.abbr}_o", mu=0, sigma=1, shape=3)
        bs_o[var.abbr] = pm.Deterministic(
            f"b_{var.abbr}_o",
            bar_bs[var.abbr] + var.sigma_b * z_bs_o[var.abbr],
        )

        # Compute contribution to mean estimate
        slopes.append(
            pm.Deterministic(
                f"slope_{var.abbr}", bs_o[var.abbr][o] * df[var.var_name].values
            )
        )

    # Mean predicted value for relative revenue

    # This produces strange results (stacking and summing along axis=1 doesn't help)
    # mu = pm.Deterministic("mu", a[o] + pm.math.sum(pm.math.stack(slopes)))

    # This works fine, for the case of two slopes
    mu = pm.Deterministic("mu", a[o] + slopes[0] + slopes[1])

    # StdDev of residual variation amongst entities, per category
    sigma = pm.Exponential(
        "sigma",
        params.lbda,
        shape=3,
    )

    # Variation of data around mean predicted value
    pm.StudentT(
        "target",
        nu=params.nu,
        sigma=sigma[o],
        mu=mu,
        observed=df["target"].values,
    )
```

`params` is a parameter object defined like this:

```
@dataclasses.dataclass(eq=True, frozen=True)
class MultiFlexVar:
    """Parameters for one regression variable in MultiFlex model"""

    abbr: str  # abbreviation for variable names
    var_name: str  # data field to regress against
    sigma_bar_b: float  # stdev of hyperprior for mean slope
    sigma_b: float  # variation of slope per category
    typical_range: Tuple[float, float]  # typical value range for prior plots
    default_value: float  # default value for counterfactual plots


@dataclasses.dataclass(eq=True, frozen=True)
class MultiFlexParams:
    """Parameters for flexible multiparametric model"""

    vars: Tuple[MultiFlexVar, ...]  # variables to regress against

    sigma_bar_a: float = 0.02  # prior stdev for global mean parameter `bar_a`
    sigma_a: float = 0.04  # prior stdev for variation of per-category mean around `bar_a`
    lbda: float = 20  # decay constant for prior of noise `sigma`
    nu: float = 2  # shape parameter for StudentT distribution

    tune: int = 2000  # tuning steps
    draws: int = 4000  # sampling draws
    target_accept: float = 0.9  # sampling tuning parameter


params = MultiFlexParams(
    vars=(
        MultiFlexVar(
            abbr="a",
            var_name="column_a",
            sigma_bar_b=0.1,
            sigma_b=0.1,
            typical_range=(0, 0.1),
            default_value=0.03,
        ),
        MultiFlexVar(
            abbr="b",
            var_name="column_b",
            sigma_bar_b=0.01,
            sigma_b=0.01,
            typical_range=(0, 1),
            default_value=0.3,
        ),
    )
)
```

What shape is `slopes`? It’s not immediately clear to me why you are stacking before summing.

`slopes` is a list of linear terms like `b * var`, where `b` is the slope that I want to determine and `var` is one of the input variables (columns in the DataFrame). I want to be able to define an arbitrary list of variables to include in these linear terms, so I’m collecting them all in the list `slopes`.

Since each of these linear terms is a 1D tensor (with length equal to that of my input DataFrame), I want to stack them "side by side" and sum across the linear terms, to again get a 1D tensor with length equal to that of my input DataFrame.

Is there a simpler way to do that?

Ah, then I suspect that it may be the list that is slowing things down here. In general, it is more efficient to keep everything as tensors. So if you have a set of N columns in your data (each representing a `var`), then you can define N parameters like this (instead of the loop you are currently using):

```
coefs = pm.Normal("coefs", mu=0, sigma=1, shape=N)
```

Or you can use the new dims/coords feature, which would allow you to “name” each parameter as you like. Either way, your sum should be something like this:

```
sums = pm.math.dot(coefs, data_columns.T)
```
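A quick NumPy sketch of the shapes involved (names `data_columns`, `coefs`, and the sizes are illustrative, not from the model above); `pm.math.dot` follows the same semantics as `np.dot`:

```python
import numpy as np

N, n_obs = 2, 4  # N regression variables, n_obs rows of data

# one column per variable, shape (n_obs, N)
data_columns = np.array([[0.0, 1.0],
                         [2.0, 3.0],
                         [4.0, 5.0],
                         [6.0, 7.0]])
coefs = np.array([0.5, 2.0])  # one coefficient per variable, shape (N,)

# (N,) . (N, n_obs) -> (n_obs,): the summed linear term for every data row at once
sums = np.dot(coefs, data_columns.T)
print(sums)  # [ 2.  7. 12. 17.]
```

So a single `dot` replaces both the Python-level loop over variables and the stack-then-sum step, and `sums` already has the right length to be added to `a[o]`.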

Thanks for your help - yes, I could probably rewrite that as a `.dot` of a parameter tensor with all my data columns.

I was under the impression that the `pm.math.stack` would somehow transform my list of 1D tensors into a 2D tensor (so that it wouldn’t matter any more that it had been a list). But apparently it’s not that easy.

Thanks a lot!
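For anyone landing here later: `stack` does turn the list into a 2D tensor, but `sum` with no `axis` argument then reduces over *every* axis and returns a scalar, which silently broadcasts against `a[o]` instead of contributing a per-row term. A minimal NumPy sketch (with made-up numbers; `pm.math.stack`/`pm.math.sum` behave the same way):

```python
import numpy as np

# two "slope" terms, each a 1-D array with one entry per data row
slopes = [np.array([1, 2, 3, 4]), np.array([10, 20, 30, 40])]

stacked = np.stack(slopes)   # shape (2, 4): terms along a new leading axis
print(np.sum(stacked))       # 110 -- no axis: a single scalar over ALL entries
print(np.sum(stacked, axis=0))  # [11 22 33 44] -- one sum per data row
print(np.sum(stacked, axis=1).shape)  # (2,) -- per-term sums, the wrong length
```

So `axis=0` (not `axis=1`) recovers the intended per-row sum: `pm.math.sum(pm.math.stack(slopes), axis=0)` should match `slopes[0] + slopes[1]`.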