Hi everyone,
I’m Alessandro, a Master’s student in Artificial Intelligence at the University of Bologna, holding a Bachelor’s in Mathematics. I am writing to express my strong interest in the “Linear Algebra Rewrites” project for GSoC 2026.
I’ve spent the last couple of weeks diving deep into the project’s context. Specifically, I have been:
- Studying the COLA (Compositional Linear Algebra) paper and exploring the repository to understand its framework for structured linear operators.
- Reviewing previous work and discussions within PyTensor in this direction.
- Going through the documentation, existing notebooks, and technical videos to understand how graph rewrites are currently handled in the codebase.
To get familiar with the library’s internals, I have already opened a few PRs related to graph rewrites:
My short-term goal is to keep contributing and opening PRs “here and there” to master the codebase and make sure I’m ready for the technical challenges of the project.
I have a couple of questions as I begin drafting my proposal:
- Are there specific areas of the PyTensor graph engine you recommend I focus on to better tackle the issue?
- Do you have any suggestions on how to get deeper into the subject, to ensure my proposal aligns with the long-term vision of the library?
Looking forward to your feedback!
Best,
Alessandro
Hi @jessegrabowski,
I have started writing down a first draft for the proposal here. I would appreciate your feedback.
Thanks,
Alessandro
Hey Alessandro
Nice to see your proposal is underway. I don’t think you need notebooks galore for this project, since it’s more of an “in the weeds” topic. If you wanted something more presentable, it would be good to think about some linear-algebra-heavy algorithms (Kalman filter, Newton’s method, Kronecker-structured GPs, LoRA, natural gradients, …?) that could directly benefit from these rewrites. Then, in addition to tracking the project in a piecewise fashion (via tests and scratch scripts/notebooks that compare before/after graphs), you can also see the rewrites in a “real use case”.
Not necessary, but something to think about if you’re more interested in one of these applied topics than in the nitty-gritty itself. And even if you’re interested in the nitty-gritty as such, it would be good to have a motivating example to organize your work around and to show it off in the future.
Thanks for the reply!
I have started exploring PyMC use cases (the ones you suggested) for my proposed linear algebra rewrites in PyTensor.
- Here is the Google Doc I am working on.
- Here is the notebook for Kronecker structured gaussian processes specifically.
Let me know if this sounds good to you; then I will write a new draft of the proposal.
Thanks,
Alessandro
Hi Alessandro
The notebook is fine, though it looks like it’s mostly LLM-generated. It certainly gives you a good view into the types of computational tricks we’re talking about. I strongly encourage you to get into the mud and start discovering how PyTensor actually works with your “bare hands”. You might also want to check what PyMC is already doing in gp.LatentKron and gp.MarginalKron, to see the state of things.
Now that you see the two ways to compute a Kronecker product, can you use graph_replace to swap a graph with y = solve(kron(a, b), x) for y = kron(solve(a, x[:a.shape[1]]), solve(b, x[a.shape[1]:]))? Think also about how you could detect these patterns in a graph, to do it automatically for the user. There are already some Kronecker-related rewrites in here that can give you some ideas.
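If it helps, the equivalence of the two ways (for a matrix-vector product, before worrying about solves) can be sanity-checked in plain NumPy. This is just a sketch with arbitrary toy shapes, not PyTensor code:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 3))
b = rng.standard_normal((4, 4))
X = rng.standard_normal((3, 4))  # the vector x reshaped, row-major

# Way 1: materialize the full Kronecker product (12 x 12 here)
dense = np.kron(a, b) @ X.ravel()

# Way 2: never build kron(a, b); use the row-major vec identity
#   (a ⊗ b) ravel(X) == ravel(a @ X @ b.T)
structured = (a @ X @ b.T).ravel()

print(np.allclose(dense, structured))  # True
```

The structured path avoids ever allocating the (n_a·n_b)² Kronecker matrix, which is the whole point of these rewrites.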
Hi Jesse,
Regarding the graph_replace challenge: y = kron(solve(a, x_slice1), solve(b, x_slice2)).
I’ve analyzed this formulation, and I believe it is an incorrect application of the inverse property of the Kronecker product:
(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}
In fact, applying this to a system (A \otimes B)y = x as kron(solve(a, x[:a.shape[1]]), solve(b, x[a.shape[1]:])) is false in general: this is easy to spot by checking dimensional compatibility, since x[a.shape[1]:] has length n_A (n_B - 1), which does not match the size of b. Instead, by combining the inverse property with the vec identity of the Kronecker product, we can obtain the correct formula. The vec identity states:
(A \otimes B) \operatorname{vec}(X) = \operatorname{vec}(B X A^T)
For row-major flattening (the default behavior of reshape and ravel), the identity reads (A \otimes B)\,\operatorname{ravel}(Y) = \operatorname{ravel}(A Y B^T). Combining this with the above-mentioned inverse property, the correct factorized replacement, which I have implemented in this new notebook, is the following:
y = \text{solve}(A \otimes B, x) \iff Y = \text{solve}(B, \text{solve}(A, X)^T)^T
where X = \operatorname{reshape}(x, (n_A, n_B)) and y = \operatorname{ravel}(Y).
While diving into the internals of LatentKron and MarginalKron, I noticed that several of the identities I’ve proposed are already being exploited “manually” within the PyMC GP module.
This raises a strategic question for my proposal: should I focus on migrating these existing hand-rolled implementations into formal PyTensor graph rewrites? That would generalize the optimizations to all computational graphs (not just GPs) and would let us refactor and simplify the current PyMC code. Or would you prefer the project to focus strictly on unlocking new linear algebra rewrites for applications that aren’t currently optimized in PyMC at all?