I’m a developer for PyLops, a matrix-free linear algebra library. PyLops provides a variety of common linear operators in such a way that they can be used like matrices, but the matrix representation is never calculated. The project was originally inspired by scipy.sparse.linalg.LinearOperator, for those who are familiar with that class.
I’ve been trying to implement a way to wrap PyLops LinearOperators into PyTensor Ops, so that I can do something like A @ x, where A is a PyLops LinearOperator (converted to a PyTensor Op) and x is a PyTensor tensor, PyMC distribution, etc. We have done something similar for PyTorch and JAX, by implementing a custom torch/jax function whose forward/backward are built from the PyLops LinearOperator matvec/rmatvec (forward/transpose functions).
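For context, the forward and adjoint of a PyLops operator are exposed as matvec/rmatvec; a trivial sketch (using pylops.MatrixMult just for illustration):

import numpy as np
import pylops

# a dense 10x2 matrix wrapped as a matrix-free operator (illustration only)
A = np.arange(20.0).reshape(10, 2)
Aop = pylops.MatrixMult(A)

x = np.ones(2)
y = Aop.matvec(x)      # forward: (2,) -> (10,), same result as A @ x
z = Aop.rmatvec(y)     # adjoint/transpose: (10,) -> (2,), same result as A.T @ y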
I’ve been working through the Creating an Op documentation, and have been struggling with a specific case: when the shape of the output is different from the shape of the input. I put together a very simple example composed of two Ops: the “forward” consumes a (2,) array and produces a (10,) array, and the “backward” consumes a (10,) array and produces a (2,) array. All values are set to zeros.
In this setup I’m trying to code the linear operation $0_{10\times 2}: \mathbb{R}^2 \to \mathbb{R}^{10}$. The “forward” is $0_{10\times 2}\, x_{2\times 1} = 0_{10\times 1}$, and the vector-Jacobian product is $0_{10\times 2}^{T}\, v_{10\times 1} = 0_{2\times 10}\, v_{10\times 1} = 0_{2\times 1}$.
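In plain NumPy terms (just restating the shapes above):

import numpy as np

A = np.zeros((10, 2))  # the 10x2 zero operator
x = np.ones(2)
v = np.ones(10)

A @ x    # forward: a (10,) vector of zeros
A.T @ v  # vector-Jacobian product: a (2,) vector of zeros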
When the dimensions of the input and output vectors differ, this fails (see the gist); it works when they coincide. I have hardcoded the dimension sizes in the infer_shape method, but this doesn’t seem to make a difference.
In make_node you specify, given the inputs, what the output types are (shape and dtype). When you write output = y.type(), you’re saying the output has the same type as y. For TensorVariables, this is equivalent to writing output = tensor(shape=y.type.shape, dtype=y.type.dtype), which is not what you want here.
If you have a matrix-to-vector operation where the dtype is the same as the input(s), you would do something like output = tensor(shape=(None,), dtype=y.type.dtype). For the shape, you can leave None if it can be anything (or you don’t know how to predict it), or you can put an integer, such as output = tensor(shape=(3,), dtype=y.type.dtype). If the shape can be computed from the static shape of the inputs, it may be something like output = tensor(shape=(y.type.shape[-1],), dtype=y.type.dtype).
infer_shape is similar to how you reason about the static shape of the output in make_node, except it can return arbitrary PyTensor expressions based on the input shapes and/or values. infer_shape is optional, but it’s useful when only the shape (not the value) of an Op’s output is needed in a graph, to avoid computing the Op just to figure out its shape.
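Putting that together, a minimal sketch for the forward of your zero operator could look something like this (the ZeroForward name and the hardcoded sizes are just for illustration):

import numpy as np
import pytensor.tensor as pt
from pytensor.graph.basic import Apply
from pytensor.graph.op import Op

class ZeroForward(Op):
    """Consumes a (2,) vector and returns a (10,) vector of zeros."""

    def make_node(self, x):
        x = pt.as_tensor_variable(x)
        # the output length (10) differs from the input length (2),
        # so build a fresh vector type instead of reusing x.type()
        out = pt.tensor(shape=(10,), dtype=x.type.dtype)
        return Apply(self, [x], [out])

    def perform(self, node, inputs, output_storage):
        (x,) = inputs
        output_storage[0][0] = np.zeros(10, dtype=x.dtype)

    def infer_shape(self, fgraph, node, input_shapes):
        # here the output shape is fixed; in general it can be any
        # PyTensor expression built from input_shapes and/or the inputs
        return [(10,)]

x = pt.vector("x", shape=(2,))
y = ZeroForward()(x)  # y has static shape (10,)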
By the way, the project is very interesting! We do have something in this vein for specific cases like det(pt.diag(x)), where we avoid building the diagonal matrix and computing the determinant of a fully fledged matrix. Similarly for stuff on Kronecker and triangular matrices (WIP), where you are almost always better off not materializing the matrices if you can.
import pytensor.tensor as pt
from pytensor.graph import rewrite_graph
x = pt.vector("x")
y = pt.diag(x)
out = pt.linalg.det(y)
out.dprint()
# Blockwise{Det, (m,m)->()} [id A]
# └─ AllocDiag{self.axis1=0, self.axis2=1, self.offset=0} [id B]
# └─ x [id C]
rewrite_graph(out).dprint()
# Prod{axes=None} [id A]
# └─ x [id B]
Obviously PyLops is more comprehensive and has cool optimization stuff on top; just wanted to highlight the similar philosophy.
Amazing, that cleared up so many things for me! See an updated gist for the fix.
With this sorted, I was also able to achieve my goal of wrapping a PyLops LinearOperator as a PyTensor Op. See this notebook. This will probably make its way into PyLops as both the wrapper and some examples. Thanks again!
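The gist of the wrapper (a simplified sketch rather than the exact notebook code; the PyLopsMatVec name and details are illustrative, assuming real-valued operators, 1-D inputs, and that pylops.Zero(10, 2) builds the 10×2 zero operator):

import numpy as np
import pylops
import pytensor.tensor as pt
from pytensor.graph.basic import Apply
from pytensor.graph.op import Op

class PyLopsMatVec(Op):
    """Applies a PyLops LinearOperator; the adjoint Op provides the gradient."""

    def __init__(self, linop, adjoint=False):
        self.linop = linop
        self.adjoint = adjoint

    def make_node(self, x):
        x = pt.as_tensor_variable(x)
        nrows, ncols = self.linop.shape
        out_len = ncols if self.adjoint else nrows
        out = pt.tensor(shape=(out_len,), dtype=x.type.dtype)
        return Apply(self, [x], [out])

    def perform(self, node, inputs, output_storage):
        (x,) = inputs
        y = self.linop.rmatvec(x) if self.adjoint else self.linop.matvec(x)
        output_storage[0][0] = np.asarray(y, dtype=node.outputs[0].dtype)

    def infer_shape(self, fgraph, node, input_shapes):
        nrows, ncols = self.linop.shape
        return [(ncols if self.adjoint else nrows,)]

    def grad(self, inputs, output_grads):
        # vector-Jacobian product: apply the adjoint (rmatvec) to the output gradient
        (v,) = output_grads
        return [PyLopsMatVec(self.linop, adjoint=not self.adjoint)(v)]

# usage with the (10, 2) zero operator from the example above
A = pylops.Zero(10, 2)
x = pt.vector("x", shape=(2,))
y = PyLopsMatVec(A)(x)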
Regarding the docs, I had a look at them and they are much clearer now. Another thing that might help make things clearer is adding type annotations to the methods (e.g., make_node, perform, grad, etc.).
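For example, something along these lines (just a rough idea of what the annotated signatures might look like):

import pytensor.tensor as pt
from pytensor.graph.basic import Apply

def make_node(self, x: pt.TensorVariable) -> Apply: ...
def perform(self, node: Apply, inputs: list, output_storage: list) -> None: ...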