For a model I am writing, I am trying to implement a loop over the last dimension of a three-dimensional array A_array.
It is a differential evolution, where the next step always depends on the previous one. Since there is no simple analytical version in my case, I am trying to do it like this (simplified model):
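The original snippet is not preserved in the thread; a minimal sketch of the kind of loop described might look like this. The shapes, the update rule, and fixed values for b and c are assumptions for illustration (in the real model b and c are pymc random variables):

```python
import numpy as np

# Hypothetical step-by-step evolution: each slice along the last axis
# is computed from the previous one, so the loop cannot be vectorized.
rng = np.random.default_rng(0)
n_x, n_y, n_t = 10, 10, 500
b, c = 0.5, 0.1          # placeholders; pymc random variables in the model
dt = 0.01

A_array = np.zeros((n_x, n_y, n_t))
A_array[:, :, 0] = rng.normal(size=(n_x, n_y))

for i in range(n_t - 1):
    A_array[:, :, i + 1] = A_array[:, :, i] + dt * (b * A_array[:, :, i] + c)
```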

When I run this loop in numpy it takes about 0.06 seconds, but when A_array is a pytensor.tensor it takes ~ 100 times longer. Why is that?

I have tried to use pytensor.scan, but b and c are pymc random variables, and somehow my scan implementation did not like that.
Any help on how to speed this loop up or properly write it in PyTensor would be highly appreciated!

Looping over pytensor variables takes a long time because it's not doing what you think it's doing. Rather than performing a recursive computation, it is iteratively constructing a computational graph, polluting it with new unnecessary variables and operations at every iteration. You should not use Python loops in pytensor to do recursive computation.

You should be able to use random variables with scan. What did you try?

Thank you for the quick reply!
That makes a lot of sense. The scan implementation I used was with an older version of the model (just one-dimensional), where I called scan with fn set to a lambda (my function) and sequences=i.
But then I got an error that one of my variables was a random variable and scan couldn't handle that.
Now I don't understand the syntax of scan well enough to implement the above code snippet in a useful way. For example, how do I tell scan to only loop over one dimension of a PyTensor tensor?

Scan always loops over the first dimension of whatever is passed to the sequences argument, so you could permute the axes to put the "batch" dimension first. Another strategy would be to give a pt.arange as the sequences argument and the tensors as non_sequences, then index the tensors in the inner function. Here's an example of the 2nd strategy, and here's an example of the 1st.

@jessegrabowski:
The issue I seem to have with both versions is that A_new[i+1] is informed by the previous step A_new[i]. In my example above I used at.set_subtensor(), but how does that work together with scan?

You can use set_subtensor inside the scan function, that's no problem at all. To recursively pass a variable into scan, use the outputs_info argument. Maybe it will look something like this (untested code):

I tried your suggestion, but the result has 4 dimensions (an additional one for i).
Based on your previous suggestions and this example I used the following:

Maybe there is a smarter way to do this last step, but it runs as expected now and is much faster: a simulation takes about 0.2 seconds.
Thank you very much for your help @jessegrabowski.

You need to add the initial state yourself, as you discovered. You could just pt.concatenate the result with A_sim[:, :, 0], though; no need to create a new array and fill it. Also, you don't need to evaluate at.shape(A_array); it's fine to give symbolic tensors as shapes.

Something like that should work. By default I think it would be on the first axis (axis=0), though, because scan goes over the first axis. You'd have to check the shapes.

Hi @jessegrabowski:
one additional question came up after using this for a while:
Is there a way to have two matrices in outputs_info?
What is the syntax for that, since pytensor.scan only returns result, updates?

You can have your inner function return a tuple for results. For example:

import pytensor
import pytensor.tensor as pt

A = pt.dmatrix('A')
B = pt.dmatrix('B')
C = pt.dmatrix('C')

def step(A, B, C):
    return A @ C, B @ C

results, updates = pytensor.scan(step, outputs_info=[A, B], non_sequences=[C], n_steps=10)

results will be a list of two symbolic tensors, each with shape (10, None, None).