How to vectorize indexing of variable for pm.math.sum?

AlexAndorra · April 11, 2020, 3:10pm

Hi all!
I’m working on a model with an ordered categorical predictor (E below). So, the associated parameter is incrementally added to the linear model: when E==0, delta_e = 0; when E==1, delta_e = 0 + delta_e1; when E==2, delta_e = 0 + delta_e1 + delta_e2, etc. In code:

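import numpy as np
import pymc3 as pm
import theano.tensor as tt

# E (ordered category), A (predictor) and R (response) are the observed data arrays
with pm.Model() as m:
    kappa = pm.Normal(
        'kappa', 0., 1.5,
        transform=pm.distributions.transforms.ordered,
        shape=6, testval=np.arange(6) - 2.5)
    bA = pm.Normal('bA', 0., 1.)
    bE = pm.Normal('bE', 0., 1.)
    
    delta = pm.Dirichlet("delta", np.repeat(2., 7), shape=7)
    # prepend a zero so that E == 0 contributes nothing
    delta_j = tt.concatenate([tt.zeros(1), delta])
    
    # this line raises the error: E is an array, not a scalar slice bound
    phi = bE * pm.math.sum(delta_j[: E]) + bA * A

    resp_obs = pm.OrderedLogistic(
        'resp_obs', phi, kappa,
        observed=R
    )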

However, this raises a ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). As far as I can tell, that's because E is an array and a slice bound has to be a single integer.
A workaround is to use a list comprehension:

delta_sum = tt.as_tensor_variable([pm.math.sum(delta_j[: E[i] + 1]) for i in range(len(E))])
phi = bE * delta_sum + bA * A

But there are about 10_000 data points, so this yields an Exception: ('Compilation failed (return status=1).
Does someone see how to vectorize that operation (or another workaround)?

Thanks a lot in advance, and stay safe!

junpenglao · April 12, 2020, 7:33am

Try computing the cumsum first, then index into the resulting cumsum array.

AlexAndorra · April 13, 2020, 9:51am

Thanks Junpeng!
Not sure I understand though: isn’t that what I’m doing with:

delta_sum = tt.as_tensor_variable([pm.math.sum(delta_j[: E[i] + 1]) for i in range(len(E))])
phi = bE * delta_sum + bA * A

?

nkaimcaudle · April 13, 2020, 1:12pm

Would this work?

delta = pm.Dirichlet("delta", np.repeat(2., 7), shape=7)
delta_j = tt.concatenate([tt.zeros(1), delta])
delta_j_cumulative = tt.cumsum(delta_j)

# delta_j already starts with a zero, so index E matches sum(delta_j[: E + 1])
phi = bE * delta_j_cumulative[E] + bA * A

AlexAndorra · April 13, 2020, 2:01pm

Oh, you’re right, that’s the meaning of Junpeng’s answer.
That works, thanks Nicholas!
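
For anyone landing here later, a minimal NumPy sketch of the same idea, outside of any model. The data are made up (random `delta` and `E`, with `E` assumed to be coded 0..7), so this only illustrates the indexing, not the actual dataset. Because `delta_j` starts with a zero, the loop's `sum(delta_j[: E[i] + 1])` corresponds to index `E[i]` of the cumulative sum; it is worth checking that offset against how `E` is coded in your own data.

```python
import numpy as np

# Hypothetical stand-ins for the thread's variables (not the real data)
rng = np.random.default_rng(0)
delta = rng.dirichlet(np.repeat(2.0, 7))        # 7 increments that sum to 1
delta_j = np.concatenate([np.zeros(1), delta])  # leading zero, length 8
E = rng.integers(0, 8, size=10_000)             # categories assumed coded 0..7

# Loop version from the first post: one partial sum per data point
loop_sum = np.array([delta_j[: e + 1].sum() for e in E])

# Vectorized version: one cumulative sum, then integer-array indexing
delta_cum = np.cumsum(delta_j)
vec_sum = delta_cum[E]

# Both approaches give the same per-observation sums
assert np.allclose(loop_sum, vec_sum)
```

The same pattern should carry over to the symbolic version with `tt.cumsum`, since indexing a tensor with an integer array is a single operation instead of 10,000 separate slices.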
