Why does Op.L_op need `outputs` argument?

Why does Op.L_op(inputs, outputs, output_gradients) need the outputs argument?

Here is the method signature description: Creating a new Op: Python implementation — PyTensor dev documentation. This seemingly older page, Ops — PyTensor dev documentation, does not have this argument in the analogous method Op.grad(inputs, output_gradients). I looked into the PyTensor code and found no essential use of outputs (maybe once I saw a reference to its type or other metadata, maybe not). This leaves the question: what is the reason for including outputs? What are its semantics? Is it for optimisation? If computing the gradient requires re-evaluating the function value (as in \frac{\mathrm{d}}{\mathrm{d}x}\,e^{x^2} = 2x\,e^{x^2}, which reuses e^{x^2}), wouldn't the graph rewriter automatically merge the duplicated nodes anyway?

(By the way, in the case of a custom Op, can it be self-referential? That is, is there a way to refer to MyOp from inside MyOp.L_op()? Yes, of course, as shown in the first link: outputs = self(inputs).)

It’s just an eager deduplication optimization, as you inferred.

Some Ops, like Scan and OpFromGraph, aren’t easy to deduplicate.

Sometimes it’s also convenient to know the dtype of the outputs, and creating a new node just for that seems silly, since the method that called L_op had the outputs in hand just before calling it.

But that’s about it. It’s also inconsistent that R_op doesn’t get the same argument, IIRC.

Otherwise, grad is disfavored simply because it wasn’t a good name: the method returns a vector-Jacobian product (a left-multiplication by the Jacobian), not a gradient.
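To illustrate that naming point, a small NumPy sketch (the function f here is my own hypothetical example) of what an L_op-style method computes: the product g^T J for an output cotangent g, of which the gradient is just the special case of a scalar output with g = 1.

```python
import numpy as np

# f(x) = [x0 * x1, x0 + x1] : R^2 -> R^2
def f(x):
    return np.array([x[0] * x[1], x[0] + x[1]])

def jacobian(x):
    # J[i, j] = d f_i / d x_j
    return np.array([[x[1], x[0]],
                     [1.0,  1.0]])

def vjp(x, g):
    # What L_op computes: left-multiply the output cotangent g with
    # the Jacobian, g^T J.  (A real Op never forms J explicitly.)
    return g @ jacobian(x)

x = np.array([2.0, 3.0])
print(vjp(x, np.array([1.0, 0.0])))  # first row of J: [3. 2.]
print(vjp(x, np.array([0.0, 1.0])))  # second row of J: [1. 1.]
```

Feeding one-hot cotangents recovers the Jacobian row by row, which is why "vector-Jacobian product" describes the method better than "grad".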
