Newton-Raphson solver using pt.scan

Hi all! I am new to using this library and was wondering if I could get a little bit of help. I am trying to solve a differential equation using Newton-Raphson. I’m attaching the simplified version below:

def newton_raphson(i, n):
    F = func(n)                                # residual vector at the current guess
    J = pt.jacobian(F, n)                      # Jacobian of the residuals
    n_new = n - pt.dot(pt.linalg.inv(J), F)    # Newton-Raphson update
    return n_new


n_out, _ = pytensor.scan(fn=newton_raphson,
                         sequences=[pt.as_tensor_variable(np.linspace(0, 1, iterations))],  # 'i'
                         outputs_info=[n_init],  # 'n'
                         strict=True)

This works fine, and at the moment I just set iterations to a large number and let it run. However, this obviously can’t do any convergence checking, so if the solution converges within a smaller number of iterations, the loop can’t stop early. My question is: is there a way to implement this with pytensor scan, or is there something else I should use?

Thanks !!!

Your code looks great! Obviously it’s simple, but it looks like it’s doing the right thing. There’s some example code for doing this in a slightly fancier way here.

Pytensor supports while loops via until conditions, so you could check for convergence and terminate early. Docs here. The downside is that you can’t compile while scans to JAX, but you can still use Numba and C.
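To make that concrete, here’s a minimal sketch (mine, not from the docs) of the Newton-Raphson step above combined with an until condition. The toy func, the tolerance, and the 100-step cap are made up for illustration, and I’m assuming until can be imported from pytensor.scan.utils:

import numpy as np
import pytensor
import pytensor.tensor as pt
from pytensor.scan.utils import until

# Toy system to stand in for the real one: func(n) = n**2 - [1, 4], with roots at [1, 2]
def func(n):
    return n ** 2 - np.array([1.0, 4.0])

def newton_raphson(n):
    F = func(n)
    J = pt.jacobian(F, n)
    n_new = n - pt.dot(pt.linalg.inv(J), F)
    # Stop the scan once the largest update falls below the tolerance
    return n_new, until(pt.max(pt.abs(n_new - n)) < 1e-10)

n_init = pt.dvector('n_init')
n_out, _ = pytensor.scan(fn=newton_raphson,
                         outputs_info=[n_init],
                         n_steps=100)  # upper bound; the loop exits earlier once converged

solve = pytensor.function([n_init], n_out[-1])
solve([3.0, 3.0])  # approximately [1., 2.]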

I’m also working on optimization in pytensor here, using implicit gradients wrapped around scipy.optimize. That should be done soonish, but I can’t give a timeline.
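For background (my gloss, not from the post): the implicit gradients come from the implicit function theorem. If $x^*(\theta)$ solves $F(x^*(\theta), \theta) = 0$, then differentiating through the solution gives

$$\frac{\partial x^*}{\partial \theta} = -\left(\frac{\partial F}{\partial x}\right)^{-1} \frac{\partial F}{\partial \theta},$$

evaluated at the root, so the gradient only needs the Jacobians at the solution rather than backpropagation through the solver’s iterations.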


Thanks, that’s exactly what I was looking for!!! And the scipy optimize would be super useful, looking forward to it!

The optimizers are merged, so if you need to symbolically call scipy.optimize.root, you can now do this via from pytensor.tensor.optimize import root

The API is a bit different from scipy’s. You provide equations (the symbolic vector of equations you want the roots of) and variables (a list of symbolic tensors with the variables to be solved for). Pytensor will automatically work out whether there are any additional parameters in the equations (the args in the scipy API), and will also compute the Jacobian of the system (if requested). Here’s an example:

import pytensor
import pytensor.tensor as pt
from pytensor.tensor.optimize import root

a, b, c, d, e, f = pt.dscalars('a b c d e f'.split())
x = pt.dvector('x', shape=(2, ))

# System is a parabola and a sideways parabola. If all parameters are 1, this is:
# x ** 2 - y - 1 = 0 and x - y ** 2 + 1 = 0
equations = pt.stack([a * x[0] ** 2 - b * x[1] - c,  d * x[0] - e * x[1] ** 2 + f])

solutn, success = root(equations=equations, variables=x,
                       method='hybr',
                       optimizer_kwargs={'tol':1e-8})

fn = pytensor.function([x, a, b, c, d, e, f],
                       [solutn, success])

fn([0., 0.], 1, 1, 1, 1, 1, 1) # [array([-0.61803399, -0.61803399]), np.True_]
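For comparison (my addition, not from the original post), here’s how the same system looks written directly against scipy.optimize.root, where the system is a plain callable and the extra parameters go through args:

import numpy as np
from scipy.optimize import root as scipy_root

def equations_np(x, a, b, c, d, e, f):
    return np.array([a * x[0] ** 2 - b * x[1] - c,
                     d * x[0] - e * x[1] ** 2 + f])

res = scipy_root(equations_np, x0=[0.0, 0.0], args=(1, 1, 1, 1, 1, 1),
                 method='hybr', tol=1e-8)
res.x  # should match the symbolic result above, roughly [-0.618, -0.618]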

Since the whole thing is symbolic, you can do fun stuff like vectorize the solver over lots of x inputs:

from pytensor.graph.replace import vectorize_graph

x_grid = pt.dmatrix('x', shape=(None, 2))

grid_of_solutions = vectorize_graph([solutn, success], 
                                    {x:x_grid})
fn2 = pytensor.function([x_grid, a, b, c, d, e, f],
                        grid_of_solutions)

# Set up a grid of starting values and find the roots for each:
import numpy as np
x_values = np.linspace(-1, 1, 20)
xx, yy = np.meshgrid(x_values, x_values)
x_in = np.c_[xx.ravel(), yy.ravel()]
solution_vals, success_flags = fn2(x_in, 1, 1, 1, 1, 1, 1)

# Categorize the roots (there should be 4: (-1, 0), (0, -1), (-0.618, -0.618), and (1.618, 1.618))
unique_solutions = np.unique(np.round(solution_vals, 3), axis=0)
solution_ids = {tuple(v.tolist()): k for k, v in enumerate(unique_solutions)}
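# (Not from the original post -- a quick check that there are exactly 4 roots: with all
#  parameters equal to 1, the system is y = x**2 - 1 and x = y**2 - 1. Subtracting the two
#  gives (x - y)(x + y + 1) = 0. If x == y, then x**2 - x - 1 = 0, so x = (1 ± sqrt(5))/2,
#  i.e. roughly 1.618 or -0.618. If x + y == -1, then x**2 + x = 0, giving (0, -1) and (-1, 0).)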

# Make a nice plot
import matplotlib.pyplot as plt
fig, ax = plt.subplots(subplot_kw={'aspect':'equal'})

x_plot = np.linspace(-2, 2, 1000)
ax.plot(x_plot, x_plot ** 2 - 1)
ax.plot(x_plot, np.sqrt(x_plot + 1), color='tab:orange')
ax.plot(x_plot, -np.sqrt(x_plot + 1), color='tab:orange')
ax.axhline(0, ls='--', c='k', lw=0.5)
ax.axvline(0, ls='--', c='k', lw=0.5)

colors = ['tab:blue', 'tab:orange', 'tab:green', 'tab:purple']

for (x0, solution) in zip(x_in, solution_vals):
    solution_id = solution_ids[tuple(np.round(solution, 3).tolist())]
    
    ax.scatter(*x0, facecolor=colors[solution_id], edgecolor='none', alpha=0.25)
    ax.scatter(*solution, color='tab:red', zorder=1000)
    ax.annotate(xy=solution, xytext=x0, text='', arrowprops={'arrowstyle':'->', 'linewidth':0.5, 'alpha':0.5})
plt.show()

And you can also ask for the gradients of the roots with respect to the parameters:

# See how the solved x-value responds to a small nudge in each parameter
d_theta = pt.grad(solutn[0], [a, b, c, d, e, f])
d_theta_vec = pt.stack(d_theta)
f_d_theta = pytensor.function([x, a, b, c, d, e, f], d_theta_vec)

f_d_theta([0., 0.], 1, 1, 1, 1, 1, 1)
# array([ 0.89442719,  1.4472136 , -2.34164079, -1.17082039, -0.7236068 , 1.89442719])

# The gradients always depend on the parameters, but depend on the initial x values only
# through which root those initial values converge to.
f_d_theta([-0.8, -0.8], # Go to the (-0.618, -0.618) root again, so the gradients are the same
          1, 1, 1, 1, 1, 1)
# array([ 0.89442719,  1.4472136 , -2.34164079, -1.17082039, -0.7236068 , 1.89442719])

# If we choose values we know go to a different root, we expect different gradients
f_d_theta([0.8, 0.8], # Choose the (1.618,  1.618) root
          1, 1, 1, 1, 1, 1)
# array([-0.89442719,  0.5527864 ,  0.34164079,  0.17082039, -0.2763932 , 0.10557281])