Using aesara.scan to "double loop" a vector over an matrix

conorhassan · September 3, 2022, 11:59pm

Hi there,

I want to create a transformed parameter using aesara.scan to avoid writing a double for loop.

I have a vector \mathbf{y}\in\mathbb{R}^P, and a matrix \mathbf{X}\in\mathbb{R}^{N\times M}, and want to do operations involving the two to return a matrix \mathbf{Z}\in\mathbb{R}^{N\times(M-P)}.

The first column \mathbf{z}_1 should be the dot product of the first P columns of \mathbf{X} with the vector \mathbf{y} applied elementwise: \mathbf{z}_1 = \mathbf{x}_{1:P} \odot \mathbf{y}.

The second column \mathbf{z}_2 is a similar operation but the columns of \mathbf{X} are shifted one to the right: \mathbf{z}_2= \mathbf{x}_{2:(P+1)} \odot \mathbf{y}. Then continue in this manner until all M-P columns are full.

Example numpy code is:

N = 100
M = 5
P = 2 
y = np.array([0.5, -0.2])
X = np.random.normal(size=(N, M))
Z = np.zeros((N, M-P))
for j in range(P, M):
    Z[:, j-P] = (y * X[:, (j-P):j]).sum(axis=1)

I have tried to adapt code from here and here for aesara.scan and (embarrassingly) tried to do the following, for a case where P=2, M =5:

# dot product elementwise to create a single column for Z
data_vec1 = at.vector("data_vec1")
data_vec2 = at.vector("data_vec2")
parameter_vec = at.vector("parameter_vec")

results1, output1 = aesara.scan(lambda x_t1, x_t2: x_t1*parameter_vec[0] + x_t2*parameter_vec[1], sequences=[data_vec1, data_vec2])
compute_elementwise = aesara.function(inputs=[data_vec1, data_vec2, parameter_vec], outputs=results1)

# use previously created columns of Z to iterate over 
vec_init1 = at.fvector()
vec_init2 = at.fvector()

input_info = [vec_init1, vec_init2]

outputs_info = dict(initial=input_info, taps=[-2, -1])

results2, updates2 = aesara.scan(fn=compute_elementwise, 
                                 outputs_info=outputs_info,
                                 n_steps=3) 

sequence_over_columns = aesara.function(inputs=[vec_init1, vec_init2], 
                                        outputs_info=outputs_info, 
                                        updates=updates2)

The second aesara.scan and aesara.function calls error with `‘list’ object has no attribute 'ndim’1, but I don’t think they are doing what I want anyway.

It sort of feels like I am making it much harder than I need to, and that I am looking for a convolve-like operator, but I am a bit stuck in the mud right now, so any help would be appreciated!

Thank you!

ricardoV94 · September 4, 2022, 6:56am

CC @jessegrabowski

jessegrabowski · September 4, 2022, 7:48am

I think you’re over-engineering the problem a bit. Sticking to your example numpy code, what you really want is a a function with inputs X, y, M, P, that loops only over the range P to M, and does a specific computation.

A nice trick when using scan is to use aesara.tensor.arange as the sequence argument, and provide the list of objects to be indexed as non-sequences. This will allow you to collapse the two scans you tried into a single scan. I wrote your numpy example in aesara like this:

y_aes = at.dvector('y')
X_aes = at.dmatrix('X')
P_aes = at.iscalar('P')
M_aes = at.iscalar('M')

results, _ = aesara.scan(lambda j, y, X, P: (y * X[:, (j-P):j]).sum(axis=1),
                         sequences=at.arange(P_aes, M_aes),
                         non_sequences=[y_aes, X_aes, P])

Z_aes = at.concatenate([results], axis=0).T

The main scan is exactly the for-loop in your example code. The only line that I think needs a bit of contemplation is Z_aes = at.concatenate([results], axis=0).T. The lambda function in the scan (which I stole from your code) outputs a single column of computation. Here we are indexing X in such a way that we will get vectors out, so the result object is a list of M-P 1d vectors. When you call concatenate on them, you’ll get back a M-P x N matrix, so you need to transpose it back to N x M-P.

Alternatively, you could try to fuss with shapes in the scan code, so that the lambda function returned N x 1 column vectors.

Verify this gives the right answer:

f_Z = aesara.function([y_aes, X_aes, P_aes, M_aes], [Z_aes])
np.allclose(f_Z(y, X, P, M)[0], Z)
>>> Out: True

Finally, if your ultimate goal is to use these computations in a PyMC model, be aware that you should not compile and evaluate the function using aesara.function. I did that just as a check to show it’s giving the right answer.

Topic		Replies	Views
Aesara outer product more than two inputs v5 aesara	1	591	July 21, 2022
Cannot convert to variable (Aesara scan) Questions	1	1429	April 25, 2021
Using aesara.scan to compute the linear predictor involving a B-spline matrix v5	8	714	October 19, 2022
Using aesara tensor as indices modeling , aesara	3	458	December 28, 2022
Aesera Scan Altering Distributions v5 bug , aesara	1	505	December 8, 2022

Using aesara.scan to "double loop" a vector over an matrix

Related topics