Hi there,
I want to create a transformed parameter using aesara.scan
to avoid writing a double for
loop.
I have a vector \mathbf{y}\in\mathbb{R}^P, and a matrix \mathbf{X}\in\mathbb{R}^{N\times M}, and want to do operations involving the two to return a matrix \mathbf{Z}\in\mathbb{R}^{N\times(M-P)}.
The first column \mathbf{z}_1 should be the dot product of the first P columns of \mathbf{X} with the vector \mathbf{y} applied elementwise: \mathbf{z}_1 = \mathbf{x}_{1:P} \odot \mathbf{y}.
The second column \mathbf{z}_2 is a similar operation but the columns of \mathbf{X} are shifted one to the right: \mathbf{z}_2= \mathbf{x}_{2:(P+1)} \odot \mathbf{y}. Then continue in this manner until all M-P columns are full.
Example numpy
code is:
N = 100
M = 5
P = 2
y = np.array([0.5, -0.2])
X = np.random.normal(size=(N, M))
Z = np.zeros((N, M-P))
for j in range(P, M):
Z[:, j-P] = (y * X[:, (j-P):j]).sum(axis=1)
I have tried to adapt code from here and here for aesara.scan
and (embarrassingly) tried to do the following, for a case where P=2, M =5:
# dot product elementwise to create a single column for Z
data_vec1 = at.vector("data_vec1")
data_vec2 = at.vector("data_vec2")
parameter_vec = at.vector("parameter_vec")
results1, output1 = aesara.scan(lambda x_t1, x_t2: x_t1*parameter_vec[0] + x_t2*parameter_vec[1], sequences=[data_vec1, data_vec2])
compute_elementwise = aesara.function(inputs=[data_vec1, data_vec2, parameter_vec], outputs=results1)
# use previously created columns of Z to iterate over
vec_init1 = at.fvector()
vec_init2 = at.fvector()
input_info = [vec_init1, vec_init2]
outputs_info = dict(initial=input_info, taps=[-2, -1])
results2, updates2 = aesara.scan(fn=compute_elementwise,
outputs_info=outputs_info,
n_steps=3)
sequence_over_columns = aesara.function(inputs=[vec_init1, vec_init2],
outputs_info=outputs_info,
updates=updates2)
The second aesara.scan
and aesara.function
calls error with `‘list’ object has no attribute 'ndim’1, but I don’t think they are doing what I want anyway.
It sort of feels like I am making it much harder than I need to, and that I am looking for a convolve-like operator, but I am a bit stuck in the mud right now, so any help would be appreciated!
Thank you!