The optimizers are merged, so if you need to symbolically call scipy.optimize.root, you can now do this via from pytensor.tensor.optimize import root.
The API is a bit different from scipy's. You provide equations (the symbolic vector of equations you want the roots of) and variables (a list of symbolic tensors with the variables to be solved for). PyTensor will automatically work out whether there are any additional parameters in the equations (the args in the scipy API), and will also compute the Jacobian of the system (if requested). Here's an example:
import pytensor
import pytensor.tensor as pt
from pytensor.tensor.optimize import root
a, b, c, d, e, f = pt.dscalars('a b c d e f'.split())
x = pt.dvector('x', shape=(2, ))
# System is a parabola and a sideways parabola. If all parameters are 1, then:
# x ** 2 - y - 1 = 0,  x - y ** 2 + 1 = 0
equations = pt.stack([a * x[0] ** 2 - b * x[1] - c, d * x[0] - e * x[1] ** 2 + f])
solutn, success = root(equations=equations, variables=x,
                       method='hybr',
                       optimizer_kwargs={'tol': 1e-8})
fn = pytensor.function([x, a, b, c, d, e, f],
                       [solutn, success])
fn([0., 0.], 1, 1, 1, 1, 1, 1) # [array([-0.61803399, -0.61803399]), np.True_]
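For comparison, here is a rough sketch of the same solve done directly with scipy (the helper function system and the alias scipy_root are just illustrative names). Note that the parameters have to be threaded through by hand via args, whereas pytensor picks them up from the graph automatically:
from scipy.optimize import root as scipy_root

def system(x, a, b, c, d, e, f):
    # Same two equations as the symbolic graph above
    return [a * x[0] ** 2 - b * x[1] - c, d * x[0] - e * x[1] ** 2 + f]

res = scipy_root(system, x0=[0., 0.], args=(1, 1, 1, 1, 1, 1), method='hybr', tol=1e-8)
res.x, res.success  # should agree with the pytensor result above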
Since the whole thing is symbolic, you can do fun stuff like vectorize the solver over lots of x inputs:
from pytensor.graph.replace import vectorize_graph
x_grid = pt.dmatrix('x', shape=(None, 2))
grid_of_solutions = vectorize_graph([solutn, success],
                                    {x: x_grid})
fn2 = pytensor.function([x_grid, a, b, c, d, e, f],
                        grid_of_solutions)
# Set up a grid of starting values and find the roots for each:
import numpy as np
x_values = np.linspace(-1, 1, 20)
xx, yy = np.meshgrid(x_values, x_values)
x_in = np.c_[xx.ravel(), yy.ravel()]
solution_vals, success_flags = fn2(x_in, 1, 1, 1, 1, 1, 1)
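# (Optional robustness tweak, not in the original example): drop any starting points
# where the solver reported failure, so failed solves don't get counted as roots below.
converged = success_flags.astype(bool)
solution_vals, x_in = solution_vals[converged], x_in[converged]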
# Categorize the roots (there should be 4 -- (-1, 0), (0, -1), (-0.618, -0.618), and (1.618, 1.618)).
unique_solutions = np.unique(np.round(solution_vals, 3), axis=0)
solution_ids = {tuple(v.tolist()): k for k, v in enumerate(unique_solutions)}
# Make a nice plot
import matplotlib.pyplot as plt
fig, ax = plt.subplots(subplot_kw={'aspect':'equal'})
x_plot = np.linspace(-2, 2, 1000)
ax.plot(x_plot, x_plot ** 2 - 1)                            # y = x**2 - 1
ax.plot(x_plot, np.sqrt(x_plot + 1), color='tab:orange')    # upper branch of x = y**2 - 1
ax.plot(x_plot, -np.sqrt(x_plot + 1), color='tab:orange')   # lower branch of x = y**2 - 1
ax.axhline(0, ls='--', c='k', lw=0.5)
ax.axvline(0, ls='--', c='k', lw=0.5)
colors = ['tab:blue', 'tab:orange', 'tab:green', 'tab:purple']
for (x0, solution) in zip(x_in, solution_vals):
    solution_id = solution_ids[tuple(np.round(solution, 3).tolist())]
    ax.scatter(*x0, facecolor=colors[solution_id], edgecolor='none', alpha=0.25)
    ax.scatter(*solution, color='tab:red', zorder=1000)
    ax.annotate(xy=solution, xytext=x0, text='', arrowprops={'arrowstyle': '->', 'linewidth': 0.5, 'alpha': 0.5})
plt.show()
And you can also ask for the gradients of the roots with respect to the parameters:
# See how the root's x-value responds to a small nudge in each parameter
d_theta = pt.grad(solutn[0], [a, b, c, d, e, f])
d_theta_vec = pt.stack(d_theta)
f_d_theta = pytensor.function([x, a, b, c, d, e, f], d_theta_vec)
f_d_theta([0., 0.], 1, 1, 1, 1, 1, 1)
# array([ 0.89442719, 1.4472136 , -2.34164079, -1.17082039, -0.7236068 , 1.89442719])
# The gradients always depend on the parameters, and they depend on the initial x values
# only insofar as those determine which root the solver converges to.
f_d_theta([-0.8, -0.8],  # Goes to the (-0.618, -0.618) root again, so the gradients are the same
          1, 1, 1, 1, 1, 1)
# array([ 0.89442719, 1.4472136 , -2.34164079, -1.17082039, -0.7236068 , 1.89442719])
# If we choose values we know go to a different root, we expect different gradients
f_d_theta([0.8, 0.8],  # Choose the (1.618, 1.618) root
          1, 1, 1, 1, 1, 1)
# array([-0.89442719, 0.5527864 , 0.34164079, 0.17082039, -0.2763932 , 0.10557281])
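As a quick sanity check (a sketch, with eps chosen arbitrarily), you can compare the symbolic gradient against a finite-difference estimate from re-running the compiled solver with a nudged parameter:
# Nudge `a` and see how the first coordinate of the root moves.
eps = 1e-6
base = fn([0., 0.], 1, 1, 1, 1, 1, 1)[0][0]
bumped = fn([0., 0.], 1 + eps, 1, 1, 1, 1, 1)[0][0]
(bumped - base) / eps  # should be close to the first entry returned by f_d_theta above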