PyTensor scan of an array with PyMC sampling

Hi all, I am new to PyMC and PyTensor and don’t fully understand how to properly work with tensor variables in my code. For my current project I need to find the root of an equation (fn=n_rootfinder in my code) which takes a scalar, k, and a numpy array, G, as inputs. To perform the root finding I am using scipy.optimize.root, which requires me to loop over the numpy array G and gives me a root for each G value. To use this with PyTensor, I know that I have to use the scan function in place of a for loop.
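For reference, outside of PyMC the computation I want looks roughly like this (a plain NumPy/SciPy sketch, with k fixed to an illustrative value instead of being a random variable):

import numpy as np
from scipy.optimize import root

def fn_to_solve(n, G, k):
    return G - (k * n**2)

k_val = 279220.0  # illustrative fixed value; in the model k has a prior
roots = np.array([root(fn_to_solve, 1e15, args=(g, k_val)).x[0] for g in G])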
My implementation is:

import numpy as np
import pymc as pm
import pytensor
import pytensor.tensor as at
from pytensor.graph.op import Op
from pytensor.graph.basic import Apply
from scipy.optimize import root

G = np.array([4.37618934e+19, 7.43831156e+19, 1.25527991e+20, 2.09732030e+20,
        3.57391678e+20, 6.22279947e+20, 1.09102152e+21, 1.95899744e+21,
        3.23156771e+21, 5.11102951e+21, 7.86538338e+21, 1.14807970e+22,
        1.61837741e+22, 2.20970762e+22])

with pm.Model() as model:
    k = pm.Normal('k', 279220, 1)

    #result = n_rootfinder(G[0], k)
    result, _ = pytensor.scan(fn=n_rootfinder, sequences=at.as_tensor_variable(G),
                              non_sequences=k)
    likelihood_fn = pm.Normal('Likelihood', result, sigma=0.5, observed=Exp_Data)  # Exp_Data is my observed data
    trace = pm.sample()
   

The sampler runs successfully when I have only a single value of G but fails when implementing the scan over the array. This is the error that I get:

error: Result from function call is not a proper array of floats.
Apply node that caused the error: RootFinder(Composite{...}.0, k_ex, Composite{...}.0, Composite{...}.0, Composite{...}.0, Composite{...}.0)
Toposort index: 11
Inputs types: [TensorType(float64, shape=()), TensorType(float64, shape=()), TensorType(float64, shape=()), TensorType(float64, shape=()), TensorType(float64, shape=()), TensorType(float64, shape=())]
Inputs shapes: [(), (), (), (), (), ()]
Inputs strides: [(), (), (), (), (), ()]
Inputs values: [array(1133136.4893891), array(2000.), array(1.13314845e-11), array(1.13314845e-10), array(1.13314845e-28), array(0.5)]
Outputs clients: [[ExpandDims{axis=0}(RootFinder.0)]]

I suspect my implementation of scan might not be correct but do not understand why the error occurs. Any help would be very much appreciated.

Could you share n_rootfinder?

Of course! n_rootfinder instantiates the PyTensor Op subclass RootFinder with G, then calls it on k to perform the root finding. The actual solve lives in another function, root_finding, which contains the equation to solve and uses scipy.optimize.root to solve for the value n:

def root_finding(G, k):

    def fn_to_solve(n, G, k):
        return G - (k*n**2)

    sol = root(fn_to_solve, 1e15, method='hybr', args=(G, k))
    return sol.x

class RootFinder(Op):

    def __init__(self, G):

        self.G = G

    def make_node(self, k):

        outputs = [at.scalar(dtype='float64')]
        return Apply(self, [k], outputs)

    def perform(self, node, inputs, outputs_storage):
        k = inputs

        outputs_storage[0][0] = root_finding(self.G, k)

def n_rootfinder(G, k):

    rootfinder = RootFinder(G)

    n = rootfinder(k)

    return n

As stated in my post, this executes successfully with only one value for G (and hence one value for n).

I’ll take a look! In the meantime, is this a simplified form of the problem or the problem you actually want to solve? If you are doing scalar optimization, it would be faster and more stable to use root_scalar and do some form of bisection search.
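For the objective above, that would look something like this (a sketch; brentq needs a bracket with a sign change, and k_val stands in for a fixed value of k):

from scipy.optimize import root_scalar

k_val = 279220.0
G0 = 4.37618934e+19  # first entry of G

# brentq is a robust bracketing (bisection-type) method; f(0) > 0 and
# f(1e12) < 0 here, so the bracket encloses the root sqrt(G0 / k_val) ≈ 1.25e7
sol = root_scalar(lambda n: G0 - k_val * n**2, bracket=[0.0, 1e12], method='brentq')
print(sol.root)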

I refactored your Op a little bit; here’s where I landed:

import pytensor.tensor as pt
import pytensor
from pytensor.graph.op import Op 
from pytensor.graph.basic import Apply
from scipy import optimize
import pymc as pm
import numpy as np

        
class RootFinder(Op):
    __props__ = ()

    # Three scalar inputs (G, k, x0) and one scalar output (the root)
    itypes = [pt.dscalar, pt.dscalar, pt.dscalar]
    otypes = [pt.dscalar]
    
    def __init__(self, objective_fn, method, optim_kwargs=None):
        self.objective_fn = objective_fn
        self.method = method
        self.optim_kwargs = {} if optim_kwargs is None else optim_kwargs
    
    def infer_shape(self, fgraph, node, i0_shapes):
        # The output has the same shape as x0, the last input
        return [i0_shapes[-1]]
    
    def perform(self, node, inputs, outputs_storage):
        G, k, x0 = inputs
        res = optimize.root(self.objective_fn, 
                            x0, 
                            method=self.method,
                            args=(G, k),
                            **self.optim_kwargs)
        
        # Store a scalar, not the length-1 array res.x
        outputs_storage[0][0] = res.x[0]

        
G = np.array([4.37618934e+19, 7.43831156e+19, 1.25527991e+20, 2.09732030e+20,
        3.57391678e+20, 6.22279947e+20, 1.09102152e+21, 1.95899744e+21,
        3.23156771e+21, 5.11102951e+21, 7.86538338e+21, 1.14807970e+22,
        1.61837741e+22, 2.20970762e+22])
        
with pm.Model() as model:
    def objective(n, G, k):
        return G - (k * n ** 2)

    k = pm.Normal('k', 279220, 1)
    x0 = 1e15
    root_op = RootFinder(objective, 'hybr')
    result, _ = pytensor.scan(root_op, sequences=[G], non_sequences=[k, x0])
    y_hat = pm.Normal('Likelihood', result, sigma=0.5)  # no observed yet; prior predictive is enough to check the graph
    idata = pm.sample_prior_predictive()

I don’t often use the make_node API; I like to use itypes and otypes instead. You also had a bit of unnecessary complexity. If you asked me exactly which change fixed it… I couldn’t tell you u_u
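For reference, the equivalent make_node version would look something like this (a sketch; all it does is build the output variable from x0’s type by hand, which is exactly what itypes/otypes automates):

class RootFinderMakeNode(RootFinder):
    # Disable the inherited itypes/otypes shortcut so make_node is used
    itypes = otypes = None

    def make_node(self, G, k, x0):
        G, k, x0 = map(pt.as_tensor_variable, (G, k, x0))
        # One output with the same type as the initial guess x0
        return Apply(self, [G, k, x0], [x0.type()])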


Hi! Thank you for the very quick response! To answer your first question: I was using the RootFinder Op with several different functions, some of which are more complex. The vector root finding was just the approach that I found to work with all of the functions I was working with.

I have implemented the changes to the Op and my sampling is now working as expected, so thank you very much for your help. As I said in my initial post, I am quite new to PyTensor, so I was not surprised that the issue was in the Op. Perhaps the fault was in the make_node API.

Just a couple things to pay attention to:

  1. The output shape is defined only by x0 (the initial value for the optimizer), which is why the infer_shape function returns [i0_shapes[-1]].
  2. Scan passes variables into the inner function in a fixed order: sequences first (defined in sequences), then recursive inputs (defined in outputs_info), then non-sequences (defined in non_sequences). That’s why the inputs to perform are unpacked in the order they are (G, k, x0), and also why x0 is set as the last input. See the first sketch below.
  3. You could add additional outputs, for example res.success, by adding more otypes (see the second sketch below).
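
Here is a tiny sketch of that ordering with all three kinds of inputs present (names are illustrative):

xs = pt.vector('xs')  # a sequence
k = pt.scalar('k')    # a non-sequence

def step(x_t, acc_tm1, k):  # (sequence, recursive, non-sequence)
    return acc_tm1 + k * x_t

acc, _ = pytensor.scan(step,
                       sequences=[xs],
                       outputs_info=[pt.zeros(())],
                       non_sequences=[k])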

Hope it’s clear enough that you can extend/adjust it to more complex situations.
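For example, a variant that also reports convergence could look like this (a sketch building on the RootFinder above; the success flag comes back as an int8 scalar):

class RootFinderWithStatus(RootFinder):
    otypes = [pt.dscalar, pt.bscalar]  # the root, plus a success flag

    def infer_shape(self, fgraph, node, i0_shapes):
        return [i0_shapes[-1], i0_shapes[-1]]

    def perform(self, node, inputs, outputs_storage):
        G, k, x0 = inputs
        res = optimize.root(self.objective_fn, x0, method=self.method,
                            args=(G, k), **self.optim_kwargs)
        outputs_storage[0][0] = res.x[0]
        outputs_storage[1][0] = np.int8(res.success)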


I kept working on this a bit, because I’m quite interested in mixing optimizers into PyMC workflows. Specifically, I wanted to know whether it’s possible to generalize the optimizer function. Here’s what I came up with. You can also eliminate the scan completely by using pt.vectorize, which might be more performant:

import pytensor.tensor as pt
import pytensor
from pytensor.graph.op import Op 
from pytensor.graph.basic import Apply
from scipy import optimize
import pymc as pm
import numpy as np


def root_op_factory(x0, *args):
    # Build a RootFinder Op whose input/output types are copied from example
    # variables: one input type per objective argument, plus x0 (always last)
    
    class RootFinder(Op):
        __props__ = ()

        itypes = [x.type for x in args] + [x0.type]
        otypes = [x0.type]
        
        def __init__(self, objective_fn, method, optim_kwargs=None):
            self.objective_fn = objective_fn
            self.method = method
            self.optim_kwargs = {} if optim_kwargs is None else optim_kwargs

        def infer_shape(self, fgraph, node, i0_shapes):
            # The output has the same shape as x0, the last input
            return [i0_shapes[-1]]

        def perform(self, node, inputs, outputs_storage):
            # x0 comes last; everything before it is forwarded to the objective
            *args, x0 = inputs
            res = optimize.root(self.objective_fn, 
                                x0, 
                                method=self.method,
                                args=tuple(args),
                                **self.optim_kwargs)

            outputs_storage[0][0] = res.x[0]
    
    return RootFinder
        
G = np.array([4.37618934e+19, 7.43831156e+19, 1.25527991e+20, 2.09732030e+20,
        3.57391678e+20, 6.22279947e+20, 1.09102152e+21, 1.95899744e+21,
        3.23156771e+21, 5.11102951e+21, 7.86538338e+21, 1.14807970e+22,
        1.61837741e+22, 2.20970762e+22])
        
with pm.Model() as model:
    def objective(n, G, k):
        return G - (k * n ** 2)
    
    G_pt = pm.ConstantData('G', G)
    k = pm.Normal('k', 279220, 1)
    x0 = pt.as_tensor(1e15)
    RF = root_op_factory(x0, G_pt[0], k)
    root_op = pt.vectorize(RF(objective, 'hybr'), '(),(),()->()')
    mu = root_op(G_pt, k, x0)
    y_hat = pm.Normal('Likelihood', mu=mu, sigma=0.5)
    idata = pm.sample_prior_predictive()
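
As a quick sanity check (a sketch; for this particular objective the root has the closed form n = sqrt(G / k), so a draw of mu should land close to that, up to the randomness in k):

mu_draw = pm.draw(mu)
print(mu_draw)
print(np.sqrt(G / 279220.0))  # analytic roots at the prior mean of k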