Difficulties in using .dist and pm.Potential for inference

You’re right that scan should work fine in principle. In my testing, performance is also very good when scanning over 1d vectors, so I wouldn’t worry about that. If you play around with it more, please report back.

My immediate suspicion is that it’s related to the backward pass. The gradient of a scan is itself computed with a scan, so things can get complicated. I work a lot with time series models, and I can usually implement fast and efficient forward passes, but the gradient computations are slow and end up fragile. I have no idea whether that’s true in your case, though.
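To make the "gradient of scan is also a scan" point concrete, here is a minimal sketch in plain Python. It is not PyTensor's actual implementation, just a hand-written illustration. The recurrence `h_t = a * h_{t-1} + x_t`, the loss (sum of all `h_t`), and the function names are assumptions I made up for the example.

```python
def forward(a, xs, h0=0.0):
    """Forward scan: h_t = a * h_{t-1} + x_t. Returns [h_0, h_1, ..., h_T]."""
    hs = [h0]
    for x in xs:
        hs.append(a * hs[-1] + x)
    return hs


def loss(a, xs):
    """Toy loss: sum of all hidden states h_1..h_T."""
    return sum(forward(a, xs)[1:])


def grad_a(a, xs):
    """Gradient of loss w.r.t. a, computed by scanning backward in time."""
    hs = forward(a, xs)  # the backward pass needs every stored forward state
    g = 0.0              # running dL/dh_{t+1}, zero past the last step
    da = 0.0
    for t in range(len(xs), 0, -1):
        g = 1.0 + a * g       # dL/dh_t: direct term plus what flows back from h_{t+1}
        da += g * hs[t - 1]   # h_t depends on a through a * h_{t-1}
    return da


if __name__ == "__main__":
    a, xs, eps = 0.9, [1.0, 2.0, 3.0], 1e-6
    finite_diff = (loss(a + eps, xs) - loss(a - eps, xs)) / (2 * eps)
    print(grad_a(a, xs), finite_diff)
```

Even in this toy case the backward loop has to keep all the forward states around and walk them in reverse, which is roughly why the gradient graph ends up heavier than the forward pass.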
