Custom PyTensor Op to calculate an analytical gradient

Hello,

I would like to implement a custom PyTensor Op that returns a value, EA (the eccentric anomaly), while supplying the analytical gradient of this function so that PyMC can continue from there. The gradients of the function I want to use are given in this paper (Equations A5 and A6).

Here:
\begin{align*} \frac{\partial E}{\partial M} &= \frac{1}{1-e\cos E(t)} \end{align*}

and

\begin{align*} \frac{\partial E}{\partial e} &= \frac{\sin E(t)}{1-e\cos E(t)} \end{align*}
The current function I already have is:

class KeplerSolverOp(Op):
    __props__ = ()

    def make_node(self, M, e):
        M = as_tensor_variable(M)
        e = as_tensor_variable(e)

        return Apply(self, [M, e], [M.type(), e.type()])

    def perform(self, node, inputs, output_storage):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)  # Compute eccentric anomaly
        output_storage[0][0] = EA

    def grad(self, inputs, output_grads):
        M, e = inputs

        EA = _calc_ecc_anom(M, e)
        sea = pm.math.sin(EA)
        cea = pm.math.cos(EA)
        temp = 1 - e * cea
        dEA_dM = 1.0 / temp
        dEA_de = sea / temp  # dE/de = sin(E) / (1 - e*cos(E))

        output_grads[0] = dEA_dM
        output_grads[1] = dEA_de

        return [dEA_dM, dEA_de]

The problem is that the output of “KeplerSolverOp” is not the single value I’m expecting. I was expecting to call it like:

kepler_solver_op = KeplerSolverOp()
eanom = kepler_solver_op(manom, ecc)

What could I be missing?

You defined an Op that has two outputs, M.type() and e.type(), so calling the Op returns two variables.
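To make that concrete, here is a minimal sketch (not from the original post; `other` is just a placeholder name for the second output):

# With two declared outputs, calling the Op yields a list of two variables:
EA, other = KeplerSolverOp()(manom, ecc)

# If only EA is wanted, declare a single output in make_node ...
#     return Apply(self, [M, e], [M.type()])
# ... and then calling the Op returns just one variable:
EA = KeplerSolverOp()(manom, ecc)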

Thank you for the fast response! If I want the output of my “KeplerSolverOp” to be “EA” from

EA = _calc_ecc_anom(M, e)

and the gradients
\begin{align*} \frac{\partial E}{\partial M} &= \frac{1}{1-e\cos E(t)} \end{align*}
and
\begin{align*} \frac{\partial E}{\partial e} &= \frac{\sin E(t)}{1-e\cos E(t)} \end{align*}
passed to PyMC to continue working with, should I then write make_node like this:

    def make_node(self, M, e):
        M = as_tensor_variable(M)
        e = as_tensor_variable(e)
        EA = _calc_ecc_anom(M, e)
        return Apply(self, [M, e], [EA])

When I do that I get:

Traceback (most recent call last):
  File "/Users/jrob/Desktop/School/NU/Research/Jason/gj504/pymc_GJ504.py", line 77, in <module>
    raoff, deoff = kepler.calc_orbit(epoch, sma, ecc, inc, aop, pan, tau, plx, mtot, tau_ref_epoch=epoch[0])
  File "/Users/jrob/Desktop/School/NU/Research/Jason/gj504/kepler.py", line 227, in calc_orbit
    f = pt.function([manom,ecc], KeplerSolverOp()(manom,ecc))
  File "/Users/jrob/miniconda3/envs/pymc/lib/python3.11/site-packages/pytensor/graph/op.py", line 293, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/Users/jrob/Desktop/School/NU/Research/Jason/gj504/kepler.py", line 75, in make_node
    return Apply(self, [M, e], [EA])
  File "/Users/jrob/miniconda3/envs/pymc/lib/python3.11/site-packages/pytensor/graph/basic.py", line 158, in __init__
    raise ValueError(
ValueError: All output variables passed to Apply must belong to it.

In make_node you just specify the types of the input and output variables; your output is doing more than that (there shouldn’t be a need to call _calc_ecc_anom in it).

Type means an instance of tensor(shape=..., dtype=...). That’s all PyTensor needs to know about the output in the make_node method; the actual computation should be defined in the perform method.
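For example (a sketch, not from the thread), the output entry in make_node is just a fresh variable of the right type, which you can take from an existing input or spell out explicitly:

import pytensor.tensor as at

M = at.vector("M")                       # M.type is a TensorType with dtype floatX (typically float64) and shape (None,)
out_placeholder = M.type()               # a new, empty variable of the same type, used as the output spec
explicit = at.tensor(dtype="float64", shape=(None,))  # the same idea, written out explicitly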

And your gradient must use only pure PyTensor operations, so you may need to call self if you need the output of the Op to define the gradient.
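For instance, a hedged sketch of what that grad method could look like for this Op (assuming a single output EA and `at` being pytensor.tensor; the multiplication by output_grads[0] applies the chain rule for whatever downstream cost uses EA):

    def grad(self, inputs, output_grads):
        M, e = inputs
        EA = self(M, e)                 # call the Op itself to get its symbolic output
        temp = 1.0 - e * at.cos(EA)
        dEA_dM = 1.0 / temp             # dE/dM = 1 / (1 - e*cos(E))
        dEA_de = at.sin(EA) / temp      # dE/de = sin(E) / (1 - e*cos(E))
        g = output_grads[0]             # gradient of the cost with respect to EA
        return [g * dEA_dM, g * dEA_de]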

Perhaps this guide will clarify some of these things: Using a “black box” likelihood function — PyMC example gallery

I haven’t seen this yet. I’ll look through it and check back here when I implement it!

Alrighty, so I reorganized my custom Ops to now look like this:

class KeplerSolverOp(Op):
    __props__ = ()

    def make_node(self, M, e):
        #setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)

        #setting input list for Apply
        inputs = [M, e]
        outputs = [M.type()]

        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)  # Compute eccentric anomaly
        outputs[0][0] = EA  # may have to add something like pt.value...


    def grad(self, inputs, out_grad):
        M, e = inputs

        grad_wrt_m, grad_wrt_e = kepler_loglikegrad(M,e)

        out_grad = grad_wrt_m, grad_wrt_e
        # print(grad_wrt_m, grad_wrt_e)
        return[grad_wrt_m,grad_wrt_e]

    
class LogLikeGrad(Op):
    def make_node(self, M, e):
        #setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)

        #setting input list for Apply
        inputs = [M, e]

        # There are two outputs with the same type as M,
        # for the partial derivatives wrt M and e
        outputs = [M.type(), M.type()]

        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs

        EA = _calc_ecc_anom(M, e)
        sea = pm.math.sin(EA)
        cea = pm.math.cos(EA)
        temp = 1 - e * cea
        dEA_dM = 1.0 / temp
        dEA_de = sea / temp  # dE/de = sin(E) / (1 - e*cos(E))

        # return [grad_M, grad_e]
        outputs[0][0] = dEA_dM
        outputs[1][0] = dEA_de

keplergrad = KeplerSolverOp()
kepler_loglikegrad = LogLikeGrad()

It seems to run, but I get a segfault. I don’t see a “core” dump to look at to investigate it, either. I wonder what could be consuming so much RAM? (If that’s even the true cause.)

One thing I have tried is to run

import numpy as np
import pytensor.tensor as at
import pymc as pm
from pytensor.graph.op import Op, Apply
from pytensor.tensor import grad, as_tensor_variable
from pytensor import function
import pytensor as pt


def _calc_ecc_anom(manom, ecc):
    """
    Computes the eccentric anomaly from the mean anomaly.
    Code from Rob De Rosa's orbit solver (e < 0.95 use Newton, e >= 0.95 use Mikkola)

    Args:
        manom (float/np.array): mean anomaly, either a scalar or np.array of any shape
        ecc (float/np.array): eccentricity, either a scalar or np.array of the same shape as manom

    Returns:
        eanom (float/np.array): eccentric anomalies, same shape as manom

    Written: Jason Wang, 2018
    """
    # print(type(ecc))
    alpha = (1.0 - ecc) / ((4.0 * ecc) + 0.5)
    beta = (0.5 * manom) / ((4.0 * ecc) + 0.5)

    aux = pm.math.sqrt(beta**2.0 + alpha**3.0) 
    z = pm.math.abs(beta + aux)**(1.0/3.0) 

    s0 = z - (alpha/z)
    s1 = s0 - (0.078*(s0**5.0)) / (1.0 + ecc)
    e0 = manom + (ecc * (3.0*s1 - 4.0*(s1**3.0)))

    se0 = pm.math.sin(e0)
    ce0 = pm.math.cos(e0)

    f = e0-ecc*se0-manom

    f1 = 1.0-ecc*ce0
    f2 = ecc*se0
    f3 = ecc*ce0
    f4 = -f2
    u1 = -f/f1
    u2 = -f/(f1+0.5*f2*u1)
    u3 = -f/(f1+0.5*f2*u2+(1.0/6.0)*f3*u2*u2)
    u4 = -f/(f1+0.5*f2*u3+(1.0/6.0)*f3*u3*u3+(1.0/24.0)*f4*(u3**3.0))

    return (e0 + u4)

class KeplerSolverOp(Op):
    __props__ = ()

    def make_node(self, M, e):
        #setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)

        #setting input list for Apply
        inputs = [M, e]
        outputs = [M.type()]

        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)  # Compute eccentric anomaly
        outputs[0][0] = EA  # may have to add something like pt.value...


    def grad(self, inputs, out_grad):
        M, e = inputs

        grad_wrt_m, grad_wrt_e = kepler_loglikegrad(M,e)

        out_grad = grad_wrt_m, grad_wrt_e
        # print(grad_wrt_m, grad_wrt_e)
        return[grad_wrt_m,grad_wrt_e]

    
class LogLikeGrad(Op):
    def make_node(self, M, e):
        #setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)

        #setting input list for Apply
        inputs = [M, e]

        # There are two outputs with the same type as M,
        # for the partial derivatives wrt M and e
        outputs = [M.type(), M.type()]

        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs

        EA = _calc_ecc_anom(M, e)
        sea = pm.math.sin(EA)
        cea = pm.math.cos(EA)
        temp = 1 - e * cea
        dEA_dM = 1.0 / temp
        dEA_de = sea / temp  # dE/de = sin(E) / (1 - e*cos(E))

        # return [grad_M, grad_e]
        outputs[0][0] = dEA_dM
        outputs[1][0] = dEA_de

keplergrad = KeplerSolverOp()
kepler_loglikegrad = LogLikeGrad()

M = at.scalar("m")
e = at.scalar("e")

out = keplergrad(M,e)
eval_out = out.eval({M: 2.96466117, e: 0.4781471212028443})
print(eval_out.eval())

and I got no errors and received “3.021801889806251”, which is correct…

Also I have tried

out = keplergrad(M,e)

grad_wrt_m, grad_wrt_e = pt.grad(out.sum(), wrt=[M, e])
pt.dprint([grad_wrt_m], print_type=True)

and received

LogLikeGrad.0 [id A] <Scalar(float64, shape=())>
 ├─ m [id B] <Scalar(float64, shape=())>
 └─ e [id C] <Scalar(float64, shape=())>

Your perform method in the gradient Op should contain only plain Python/NumPy code. You’re using pm.math, which is not correct. If you want to use those operations, you would do so in the grad method of the original Op (the grad method, in contrast to perform, uses only PyTensor code).
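For instance, a minimal sketch of the gradient Op’s perform with NumPy only (this assumes _calc_ecc_anom is likewise rewritten with np.sqrt, np.sin, np.cos, etc., so that it returns plain NumPy values):

    def perform(self, node, inputs, outputs):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)   # must be plain NumPy here, no pm.math / symbolic ops
        temp = 1.0 - e * np.cos(EA)
        outputs[0][0] = np.asarray(1.0 / temp)           # dE/dM
        outputs[1][0] = np.asarray(np.sin(EA) / temp)    # dE/de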

Perfect! That was the solution. For anybody else who comes across this: the solution is to remove any pm.math functions (use plain NumPy inside perform) and separate the value and its gradient into two Ops!
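For anyone adapting this, a quick sanity check (a sketch under the same assumptions as above, not from the original thread) is to compile the value and its gradients and compare the analytic gradients against central finite differences:

import numpy as np
import pytensor
import pytensor.tensor as at

M = at.scalar("M")
e = at.scalar("e")
EA = KeplerSolverOp()(M, e)
dEA_dM, dEA_de = pytensor.grad(EA, wrt=[M, e])
f = pytensor.function([M, e], [EA, dEA_dM, dEA_de])

m0, e0, eps = 2.96466117, 0.4781471212028443, 1e-6
ea, gm, ge = f(m0, e0)
fd_gm = (f(m0 + eps, e0)[0] - f(m0 - eps, e0)[0]) / (2 * eps)   # finite-difference dE/dM
fd_ge = (f(m0, e0 + eps)[0] - f(m0, e0 - eps)[0]) / (2 * eps)   # finite-difference dE/de
print(ea, gm - fd_gm, ge - fd_ge)   # the differences should be tiny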

Thank you so much for the help!
