Custom Theano Op to do numerical integration

I am using PyMC3 for parameter estimation with a particular likelihood function that has to be defined by hand. Here L is the analytic form of my likelihood function. I have observational data for the radial velocity (vr) and position (r) of some objects, imported from an Excel file. My priors are on M and beta, both assumed to be uniform. I want to incorporate an integral, integ(gamma, beta), into my likelihood function; it is a function of one of my parameters. I tried using a custom Theano Op to do this, but it seems the Op can't return a number. The following is the code and the error message I am getting:
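
Concretely, reading off the code below, the integral I need is integ(\gamma, \beta) = \int_{-\pi/2}^{\pi/2} \cos(\theta)^{2q + 3}\, d\theta, with q = \gamma/(\beta - 2) - 3/2.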

import numpy as np
import pandas
import pymc3 as pm
from pymc3 import find_MAP
import theano
import theano.tensor as tt
import random as rm
import decimal as dm
from scipy.integrate import quad

data_ = np.array(pandas.read_excel('aaa.xlsx', header=None))
gamma = 3.77
G = 4.302*10**-6
rmin = 3.0
R = 95.7
# rn, vn: observed radii and radial velocities, extracted from data_ (not shown)


class integrateOut(theano.Op):
    def __init__(self, f, t, t0, tf,*args, **kwargs):
        super(integrateOut,self).__init__()
        self.f = f
        self.t = t
        self.t0 = t0
        self.tf = tf

    def make_node(self, *inputs):
        self.fvars=list(inputs)
        try:
            self.gradF = tt.grad(self.f, self.fvars)
        except Exception:
            self.gradF = None
        return theano.Apply(self, self.fvars, [tt.dscalar().type()])

    def perform(self, node, inputs, output_storage):
        args = tuple(inputs)
        f = theano.function([self.t] + self.fvars, self.f, mode='DebugMode')
        output_storage[0][0] = quad(f, self.t0, self.tf, args=args)[0]

    def grad(self, inputs, grads):
        return [integrateOut(g, self.t, self.t0, self.tf)(*inputs) * grads[0]
                for g in self.gradF]


def integ(gamma, beta):
    z = tt.cos(theta)**(2*((gamma/(beta - 2)) - 3/2) + 3)
    return integrateOut(z, theta, -(np.pi)/2, (np.pi)/2)(beta)


with pm.Model() as basic_model:
    M = pm.Uniform('M', lower=10**8, upper=10**13)
    beta = pm.Uniform('beta', lower=2.001, upper=2.999)
    theta = pm.Normal('theta', 0, 10**2)
    M_print = tt.printing.Print('M')(M)
    bet_print = tt.printing.Print('beta')(beta)
    def logp_func(rn,vn):
        q = (gamma/(beta - 2)) - 3/2
        B = (G*M) / ((beta -2 )*(R**(3 - beta)))
        K = (gamma - 3) / ((rmin**(3 - gamma)) * (2*B)**0.5) * integ(gamma, beta)
        logp = - tt.log(K*((1 -((1/(2*B))*((vn**2)*rn**(beta - 
                                           2))))**(q+1))*(rn**(1-gamma +(beta/2))))
        return tt.sum(logp)
    logpvar = pm.DensityDist("logpvar", logp_func, observed={"rn": rn,"vn":vn})
    trace = pm.sample(1000, tune=1000)
    print(pm.summary(trace))
    map_estimate = pm.find_MAP(model=basic_model)
    print(map_estimate)

But this returns an error message:

         NotImplementedError: input nd
        Apply node that caused the error: InplaceDimShuffle{x}(<__main__.integrateOut object at 
        0x0000008F108681D0>.0)
          Toposort index: 11
         Inputs types: [TensorType(float64, scalar)]
         Inputs shapes: ['No shapes']
         Inputs strides: ['No strides']
         Inputs values: [0.6348844212308064]
         Outputs clients: [[Elemwise{Composite{log(((i0 * i1 * ((i2 - ((i3 * i4 * i5 * i6 * (i7 ** i4)) / i8)) ** i9) 
         *   (i7 ** i10)) / i11))}}(TensorConstant{(1,) of 1...9421338119}, InplaceDimShuffle{x}.0, 
         TensorConstant{(1,) of 1.0}, TensorConstant{(1,) of 11..225.011623},   
         Elemwise{add,no_inplace}.0, Elemwise{Composite{(i0 ** (i1 - i2))}}.0, TensorConstant{[  
             1.26045..73344e+04]}, TensorConstant{[  2.7405 ..  99.528 ]}, InplaceDimShuffle{x}.0, 
           Elemwise{Composite{(i0 + (i1 / i2))}}.0, Elemwise{Composite{(i0 + (i1 * i2))}}.0, 
           Elemwise{Composite{sqrt(((i0 * i1) / (i2 * i3)))}}.0)]]

         HINT: Re-running with most Theano optimization disabled could give you a back-trace of when 
         this node was created. This can be done with by setting the Theano flag 
         'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 
         'optimizer=None'.

          HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map 
          footprint of this apply node.


I tried to have a look at the code, but it is quite badly formatted :sweat_smile:
Could you please format it and put it in a markdown code block?

Also, maybe you can try to find out whether there is a closed-form expression for the integral, or some reparameterization trick to express it differently. That might spare you some of the difficulty of having to debug Theano.
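
For what it's worth: if the integrand really is just a power of cos(theta) on [-pi/2, pi/2], the standard result \int_{-\pi/2}^{\pi/2} \cos(\theta)^n d\theta = \sqrt{\pi}\,\Gamma((n+1)/2)/\Gamma(n/2+1) would give you the integral without any custom Op. A minimal sketch, assuming the exponent n = 2q + 3 from your code (please double-check against your model):

import numpy as np
import theano.tensor as tt

def integ_closed_form(gamma, beta):
    q = gamma / (beta - 2.) - 3. / 2.
    # integral = sqrt(pi) * Gamma(q + 2) / Gamma(q + 5/2); gammaln keeps it numerically stable
    return np.sqrt(np.pi) * tt.exp(tt.gammaln(q + 2.) - tt.gammaln(q + 2.5))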

I’m sorry. I have formatted it now. I got the custom Theano Op for integration from this link: https://stackoverflow.com/questions/42678490/custom-theano-op-to-do-numerical-integration.

I don't think you should define theta as a Normal distribution, as you are integrating it out over [-pi/2, pi/2] anyway.

The integration seems to be working, but I am still hitting a different error… my attempt is below:

with pm.Model() as basic_model:
    M = pm.Uniform('M', lower=10**8, upper=10**13)
    beta = pm.Uniform('beta', lower=2.001, upper=2.999, testval=2.5)
    # theta = pm.Normal('theta', 0, 10**2)
    theta = tt.dscalar('theta')
    theta.tag.test_value=0
    def logp_func(rn, vn):
        q = (gamma/(beta - 2)) - 3/2
        B = (G*M) / ((beta -2 )*(R**(3 - beta)))
        z1 = tt.cos(theta)**(2*((gamma/(beta - 2)) - 3/2) + 3)
        intZ = integrateOut(z1, theta, -(np.pi)/2, (np.pi)/2)(beta)
        integ = tt.printing.Print('integ')(intZ)
        
        K = (gamma - 3) / ((rmin**(3 - gamma)) * (2*B)**0.5) * intZ
        logp = - tt.log(K*((1 -((1/(2*B))*((vn**2)*rn**(beta - 
                                           2))))**(q+1))*(rn**(1-gamma +(beta/2))))
        return tt.sum(logp)
    
    logpvar = pm.DensityDist("logpvar", logp_func, observed={"rn": rn, "vn":vn})
    trace = pm.sample()

For me it is the same error.

I think I worked out how to do that. I implemented a more general integration op:

from scipy.integrate import quad
import theano
import theano.tensor as tt
import numpy as np


class Integrate(theano.Op):
    def __init__(self, expr, var, *extra_vars):
        super().__init__()
        self._expr = expr
        self._var = var
        self._extra_vars = extra_vars
        self._func = theano.function(
            [var] + list(extra_vars),
            self._expr,
            on_unused_input='ignore')
    
    def make_node(self, start, stop, *extra_vars):
        self._extra_vars_node = extra_vars
        assert len(self._extra_vars) == len(extra_vars)
        self._start = start
        self._stop = stop
        vars = [start, stop] + list(extra_vars)
        return theano.Apply(self, vars, [tt.dscalar().type()])
    
    def perform(self, node, inputs, out):
        start, stop, *args = inputs
        val = quad(self._func, start, stop, args=tuple(args))[0]
        out[0][0] = np.array(val)
        
    def grad(self, inputs, grads):
        start, stop, *args = inputs
        out, = grads
        replace = dict(zip(self._extra_vars, args))
        
        replace_ = replace.copy()
        replace_[self._var] = start
        dstart = out * theano.clone(-self._expr, replace=replace_)
        
        replace_ = replace.copy()
        replace_[self._var] = stop
        dstop = out * theano.clone(self._expr, replace=replace_)

        grads = tt.grad(self._expr, self._extra_vars)
        dargs = []
        for grad in grads:
            integrate = Integrate(grad, self._var, self._extra_vars)
            darg = out * integrate(start, stop, *args)
            dargs.append(darg)
            
        return [dstart, dstop] + dargs

    
## Basic usage

# We define the function we want to integrate
x = tt.dscalar('x')
x.tag.test_value = np.zeros(())
a = tt.dscalar('a')
a.tag.test_value = np.ones(())

func = a ** 2 * x**2
integrate = Integrate(func, x, a)

# Check gradients
from theano.tests.unittest_tools import verify_grad
verify_grad(integrate, (np.array(0.), np.array(1.), np.array(2.)))
verify_grad(integrate, (np.array(-2.), np.array(5.), np.array(8.)))


# Now, we define values for the integral
start = tt.dscalar('start')
start.tag.test_value = np.zeros(())
stop = tt.dscalar('stop')
stop.tag.test_value = np.ones(())
a_ = tt.dscalar('a_')
a_.tag.test_value = np.ones(())

# Note that a_ != a
val = integrate(start, stop, a_)

You can evaluate the integral and its derivatives like this:

val.eval({start: 0., stop: 1., a_: 2.})
tt.grad(val, a_).eval({start: -2, stop: 1, a_: 2.})
tt.grad(val, start).eval({start: 1., stop: 2., a_: 2.})

You can use this in PyMC3 like this:

import pymc3 as pm

## Usage in PyMC3
with pm.Model() as model:
    start = pm.Normal('start', -5, 1)
    stop = pm.Normal('stop', 5, 1)
    a = pm.Normal('a', 0.5, 1)
    
    # Define the function to integrate in plain theano
    x = tt.dscalar('x_')
    x.tag.test_value = np.zeros(())
    a_ = tt.dscalar('a_')
    a_.tag.test_value = np.ones(())

    func = a_ ** 2 * x**2
    integrate = Integrate(func, x, a_)

    # Now we plug in the values from the model.
    # The `a_` from above corresponds to the `a` here.
    val = integrate(start, stop, a)
    pm.Normal('y', mu=val, sd=1, observed=10)

You might want to test the integration Op a bit more; I only did some very basic checks. Just plug in different functions and try to break it. If you find a problem, I'd like to hear about it :slightly_smiling_face:
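
For example, here is a quick value check against scipy that doesn't go through the gradient path (a sketch reusing the x, start and stop variables defined above; func2 and b_ are just names made up for the example):

b_ = tt.dscalar('b_')
b_.tag.test_value = np.ones(())
func2 = tt.exp(-b_ * x ** 2)
integrate2 = Integrate(func2, x, b_)
val2 = integrate2(start, stop, b_)

# should agree with a direct scipy quadrature of the same integrand
print(val2.eval({start: 0., stop: 1., b_: 2.}))
print(quad(lambda t: np.exp(-2. * t ** 2), 0., 1.)[0])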

If you have trouble with this in your use-case, feel free to ask.


Thanks! I will check

@aseyboldt Thank you for your code. I am running into an error when using your integration class.

Granted, my integration is significantly more complicated than your example.

I am trying to recreate the graphical model for a stop-signal reaction time analysis found in this paper – http://dora.erbe-matzke.com/papers/Matzkeetal2012.pdf. There is some old code I found here – https://github.com/twiecki/stopsignal/blob/master/src/stop_likelihoods.pyx – that has most of the likelihood functions in the older PyMC and uses some of @twiecki's Cython code. I believe I have most of the setup correct except the integration. When running pm.sample() I get a "TypeError: not all arguments converted during string formatting" error. I am unsure what that error means and where to begin debugging, but I am fairly sure it has to do with my integration code. If I need to make this its own post I will, but I am not sure whether the issue is in the integration code or in my own.
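
For reference, the quantity I am trying to compute with the integral (if I have read the paper and set the model up correctly) is the inhibition likelihood, which the code below builds as the ex-Gaussian go survival function times the ex-Gaussian stop density shifted by the stop-signal delay:

p(\text{inhibit} \mid SSD) = \int_{t_{min}}^{t_{max}} [1 - F_{go}(t)]\, f_{stop}(t - SSD)\, dt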

def ExGaussCDF(value, mu, sigma, nu):
    
    p = tt.switch(tt.gt(nu,0.05 * sigma),
                  tt.log(1-(std_cdf((value-mu)/sigma)
                            -std_cdf((value - mu - ((sigma**2)/nu))/sigma)
                            *tt.exp((((mu+((sigma**2)/nu))**2)-(mu**2)-2*value *((sigma**2)/nu))/(2*(sigma**2))))),
                  tt.log((1-(std_cdf(((value-mu)/sigma))))))
    return p

def ExGaussPDF(value, mu, sigma, nu):
    
    lp = tt.switch(tt.gt(nu,  0.05 * sigma),
                       - tt.log(nu) + (mu - value) / nu + 0.5 * (sigma / nu)**2
                       + logpow(std_cdf((value - mu) / sigma - sigma / nu), 1.),
                       - tt.log(sigma * tt.sqrt(2 * np.pi))
                       - 0.5 * ((value - mu) / sigma)**2)
    
    return lp
        
basic_model = pm.Model()

with basic_model:

    # declare all variables for integration
    x = tt.dscalar('x_')
    x.tag.test_value = np.zeros(())
    mu_go_ = tt.dscalar('mu_go_')
    mu_go_.tag.test_value = np.ones(())
    sigma_go_ = tt.dscalar('sigma_go_')
    sigma_go_.tag.test_value = np.ones(())
    nu_go_ = tt.dscalar('nu_go_')
    nu_go_.tag.test_value = np.ones(())
    mu_stop_ = tt.dscalar('mu_stop_')
    mu_stop_.tag.test_value = np.ones(())
    sigma_stop_ = tt.dscalar('sigma_stop_')
    sigma_stop_.tag.test_value = np.ones(())
    nu_stop_ = tt.dscalar('nu_stop_')
    nu_stop_.tag.test_value = np.ones(())
    ssd_ = tt.dscalar('ssd_')
    ssd_.tag.test_value = np.ones(())

    start = tt.dscalar('start')
    start.tag.test_value = 0.001 # 1 ms
    stop = tt.dscalar('stop')
    stop.tag.test_value = 5.  # 5 seconds

    # Make function e^(ExGaussCDF(x,go)) * e^(ExGaussPDF(x-ssd,stop))
    func = tt.exp(tt.switch(tt.gt(nu_go_,0.05 * sigma_go_),
              tt.log(1-(std_cdf((x-mu_go_)/sigma_go_)
                        -std_cdf((x - mu_go_ - ((sigma_go_**2)/nu_go_))/sigma_go_)
                        *tt.exp((((mu_go_+((sigma_go_**2)/nu_go_))**2)-(mu_go_**2)-2*x *((sigma_go_**2)/nu_go_))/(2*(sigma_go_**2))))),
              tt.log((1-(std_cdf(((x-mu_go_)/sigma_go_))))))) * \
              tt.exp(tt.switch(tt.gt(nu_stop_,  0.05 * sigma_stop_),
                   - tt.log(nu_stop_) + (mu_stop_ - (x-ssd_)) / nu_stop_ + 0.5 * (sigma_stop_ / nu_stop_)**2
                   + logpow(std_cdf(((x-ssd_) - mu_stop_) / sigma_stop_ - sigma_stop_ / nu_stop_), 1.),
                   - tt.log(sigma_stop_ * tt.sqrt(2 * np.pi))
                   - 0.5 * (((x-ssd_) - mu_stop_) / sigma_stop_)**2))

    integrate = Integrate(func, x, mu_go_,sigma_go_,nu_go_,mu_stop_,sigma_stop_,nu_stop_,ssd_)

    # Priors for unknown model parameters
    mu_go = pm.Uniform('mu_go',.001,1)
    sigma_go = pm.Uniform('sigma_go',.001,.3)
    nu_go = pm.Uniform('nu_go',.001,.3)
    
    # Priors for unknown stop model parameters
    mu_stop = pm.Uniform('mu_stop',.001,.6)
    sigma_stop = pm.Uniform('sigma_stop',.001,.25)
    nu_stop = pm.Uniform('nu_stop',.001,.25)
    
    def logp_SSRT(value,ssd_rts):
        p = ExGaussPDF(value, mu_go, sigma_go, nu_go) + ExGaussCDF(value-ssd_rts, mu_stop, sigma_stop, nu_stop)
        return tt.sum(p)
    
    def logp_Inhibition(ssd_is):
        p = 0
        #sum over inhibit trials
        for issd in ssd_is:
#             print('SSD value: %.3f' % issd.tag.test_value)
            p += tt.log(integrate(start,stop,mu_go,sigma_go,nu_go,mu_stop,sigma_stop,nu_stop,issd))
        return p
    
    goRTg = pm.ExGaussian('goRTg',mu=mu_go,sigma=sigma_go,nu=nu_go,observed = RTgo)
    
    SSRT = pm.DensityDist('SSRT',logp_SSRT, observed={'value': np.array(SSRT_stop),'ssd_rts': ssd_rt})
    
    Inhibition = pm.DensityDist('Inhib',logp_Inhibition, observed= {'ssd_is': ssd_inhib} )
    
    trace = pm.sample(5000, burn = 1000)

Here is the total error output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-38-5b2c3f9ef829> in <module>()
     87     Inhibition = pm.DensityDist('Inhib',logp_Inhibition, observed= {'ssd_is': ssd_inhib} )
     88 
---> 89     trace = pm.sample(5000, burn = 1000)

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/pymc3/sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, njobs, tune, nuts_kwargs, step_kwargs, progressbar, model, random_seed, live_plot, discard_tuned_samples, live_plot_kwargs, compute_convergence_checks, **kwargs)
    371             start_, step = init_nuts(init=init, chains=chains, n_init=n_init,
    372                                      model=model, random_seed=random_seed,
--> 373                                      progressbar=progressbar, **args)
    374             if start is None:
    375                 start = start_

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/pymc3/sampling.py in init_nuts(init, chains, n_init, model, random_seed, progressbar, **kwargs)
   1352             'Unknown initializer: {}.'.format(init))
   1353 
-> 1354     step = pm.NUTS(potential=potential, model=model, **kwargs)
   1355 
   1356     return start, step

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/pymc3/step_methods/hmc/nuts.py in __init__(self, vars, max_treedepth, early_max_treedepth, **kwargs)
    150         `pm.sample` to the desired number of tuning steps.
    151         """
--> 152         super(NUTS, self).__init__(vars, **kwargs)
    153 
    154         self.max_treedepth = max_treedepth

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py in __init__(self, vars, scaling, step_scale, is_cov, model, blocked, potential, integrator, dtype, Emax, target_accept, gamma, k, t0, adapt_step_size, step_rand, **theano_kwargs)
     61 
     62         super(BaseHMC, self).__init__(vars, blocked=blocked, model=model,
---> 63                                       dtype=dtype, **theano_kwargs)
     64 
     65         self.adapt_step_size = adapt_step_size

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py in __init__(self, vars, model, blocked, dtype, **theano_kwargs)
    213 
    214         self._logp_dlogp_func = model.logp_dlogp_function(
--> 215             vars, dtype=dtype, **theano_kwargs)
    216 
    217     def step(self, point):

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/pymc3/model.py in logp_dlogp_function(self, grad_vars, **kwargs)
    656         varnames = [var.name for var in grad_vars]
    657         extra_vars = [var for var in self.free_RVs if var.name not in varnames]
--> 658         return ValueGradFunction(self.logpt, grad_vars, extra_vars, **kwargs)
    659 
    660     @property

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/pymc3/model.py in __init__(self, cost, grad_vars, extra_vars, dtype, casting, **kwargs)
    397             self._cost, grad_vars, self._ordering.vmap)
    398 
--> 399         grad = tt.grad(self._cost_joined, self._vars_joined)
    400         grad.name = '__grad'
    401 

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gradient.py in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected, null_gradients)
    603 
    604     rval = _populate_grad_dict(var_to_app_to_idx,
--> 605                                grad_dict, wrt, cost_name)
    606 
    607     for i in xrange(len(rval)):

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gradient.py in _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
   1369         return grad_dict[var]
   1370 
-> 1371     rval = [access_grad_cache(elem) for elem in wrt]
   1372 
   1373     return rval

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gradient.py in <listcomp>(.0)
   1369         return grad_dict[var]
   1370 
-> 1371     rval = [access_grad_cache(elem) for elem in wrt]
   1372 
   1373     return rval

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gradient.py in access_grad_cache(var)
   1324                     for idx in node_to_idx[node]:
   1325 
-> 1326                         term = access_term_cache(node)[idx]
   1327 
   1328                         if not isinstance(term, gof.Variable):

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gradient.py in access_term_cache(node)
   1019             inputs = node.inputs
   1020 
-> 1021             output_grads = [access_grad_cache(var) for var in node.outputs]
   1022 
   1023             # list of bools indicating if each output is connected to the cost

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gradient.py in <listcomp>(.0)
   1019             inputs = node.inputs
   1020 
-> 1021             output_grads = [access_grad_cache(var) for var in node.outputs]
   1022 
   1023             # list of bools indicating if each output is connected to the cost

    [... the access_grad_cache / access_term_cache / <listcomp> frames above repeat several more times as Theano recurses through the graph ...]

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gradient.py in access_term_cache(node)
   1160 
   1161                 input_grads = node.op.L_op(inputs, node.outputs,
-> 1162                                            new_output_grads)
   1163 
   1164                 if input_grads is None:

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/gof/op.py in L_op(self, inputs, outputs, output_grads)
    709 
    710     def L_op(self, inputs, outputs, output_grads):
--> 711         return self.grad(inputs, output_grads)
    712 
    713     def R_op(self, inputs, eval_points):

<ipython-input-13-44f8ebb860ef> in grad(self, inputs, grads)
     39         dargs = []
     40         for grad in grads:
---> 41             integrate = Integrate(grad, self._var, self._extra_vars)
     42             darg = out * integrate(start, stop, *args)
     43             dargs.append(darg)

<ipython-input-13-44f8ebb860ef> in __init__(self, expr, var, *extra_vars)
      8             [var] + list(extra_vars),
      9             self._expr,
---> 10             on_unused_input='ignore')
     11 
     12     def make_node(self, start, stop, *extra_vars):

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/compile/function.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
    300         fn = orig_function(inputs, outputs,
    301                            mode=mode,
--> 302                            accept_inplace=accept_inplace, name=name)
    303     else:
    304         # note: pfunc will also call orig_function -- orig_function is

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/compile/function_module.py in orig_function(inputs, outputs, mode, accept_inplace, name, profile, on_unused_input, output_keys)
   1816     mode = theano.compile.mode.get_mode(mode)
   1817 
-> 1818     inputs = list(map(convert_function_input, inputs))
   1819     if outputs is not None:
   1820         if isinstance(outputs, (list, tuple)):

~/Google_Drive/pymc3-dev/lib/python3.6/site-packages/theano/compile/function_module.py in convert_function_input(input)
   1899             else:
   1900                 raise TypeError("Invalid input syntax: %s (check "
-> 1901                                 "documentation or use an In instance)" % orig)
   1902         elif isinstance(input[0], SymbolicInput):
   1903             if len(input) == 1:

TypeError: not all arguments converted during string formatting

What is std_cdf?

That is the standard normal CDF; it's from pymc3.distributions.dist_math.
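
If it helps, the import line (and, as far as I remember the implementation, it is just the erf-based normal CDF):

from pymc3.distributions.dist_math import std_cdf
# roughly: std_cdf(x) = 0.5 + 0.5 * tt.erf(x / tt.sqrt(2.))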

If my [start, stop] bounds are given arrays, that is, only the parameter a is a variable, then how do I write it?

Dear friend,

Thank you very much for your code. I tried a simple example to understand the code better. When I use one parameter everything is OK, but when I pass two parameters it doesn't work. You can see my code below. Any help would be greatly appreciated.

from scipy.integrate import quad
import theano
import theano.tensor as tt
import numpy as np
import pymc3 as pm


class Integrate(theano.Op):
    def __init__(self, expr, var, *extra_vars):
        super().__init__()
        self._expr = expr
        self._var = var
        self._extra_vars = extra_vars
        self._func = theano.function(
            [var] + list(extra_vars),
            self._expr,
            on_unused_input='ignore')
    
    def make_node(self, start, stop, *extra_vars):
        self._extra_vars_node = extra_vars
        assert len(self._extra_vars) == len(extra_vars)
        self._start = start
        self._stop = stop
        vars = [start, stop] + list(extra_vars)
        #vars = list(extra_vars)
        return theano.Apply(self, vars, [tt.dscalar().type()])
    
    def perform(self, node, inputs, out):
        start, stop, *args = inputs
        val = quad(self._func, start, stop, args=tuple(args))[0]
        out[0][0] = np.array(val)
        
    def grad(self, inputs, grads):
        start, stop, *args = inputs
        out, = grads
        replace = dict(zip(self._extra_vars, args))
        
        replace_ = replace.copy()
        replace_[self._var] = start
        dstart = out * theano.clone(-self._expr, replace=replace_)
        
        replace_ = replace.copy()
        replace_[self._var] = stop
        dstop = out * theano.clone(self._expr, replace=replace_)

        grads = tt.grad(self._expr, self._extra_vars) 
        dargs = []
        for grad in grads:
            integrate = Integrate(grad, self._var, self._extra_vars)
            darg = out * integrate(start, stop, *args)
            dargs.append(darg)
   
        return [dstart, dstop] + dargs




y_obs = 8.3


start = theano.shared(1.)
stop = theano.shared(2.)
with pm.Model() as basic_model:
    a = pm.Uniform('a', 1.5, 3.5)
    b = pm.Uniform('b', 4., 6.)

    t = tt.dscalar('t')
    t.tag.test_value = np.zeros(())
   
    func = t**a + b
    integrate = Integrate(func, t, a, b)
    mu = integrate(start, stop, a, b)
    

    y = pm.Normal('y', mu=mu, sd=0.4, observed=y_obs)
    trace = pm.sample(1500,tune=500, cores=2,chains=2)

There is a small mistake in the gradient part: the extra variables need to be unpacked with *, otherwise the whole tuple is passed to theano.function as a single input, which is what produces the confusing string-formatting TypeError:

...
for grad in grads:
    integrate = Integrate(grad, self._var, *self._extra_vars)
    ...

Also, the correct usage should be:

y_obs = 8.3

start = theano.shared(1.)
stop = theano.shared(2.)
with pm.Model() as basic_model:
    a = pm.Uniform('a', 1.5, 3.5)
    b = pm.Uniform('b', 4., 6.)

    # Define the function to integrate in plain theano
    t = tt.dscalar('t')
    t.tag.test_value = np.zeros(())
    a_ = tt.dscalar('a_')
    a_.tag.test_value = np.ones(())*2.
    b_ = tt.dscalar('b_')
    b_.tag.test_value = np.ones(())*5.
    func = t**a_ + b_
    integrate = Integrate(func, t, a_, b_)

    # Now we plug in the values from the model.
    # The `a_` and `b_` from above corresponds to the `a` and `b` here.
    mu = integrate(start, stop, a, b)
    y = pm.Normal('y', mu=mu, sd=0.4, observed=y_obs)
    trace = pm.sample(1500, tune=500, cores=2, chains=2)

Thanks for your help. I updated my code according to your point and now the code is correct, but I face another problem. When I use the NUTS and SMC step methods the code works, but when I use Metropolis or HamiltonianMC it doesn't: in those cases the code only samples from the prior. It seems strange. Any help? My code is attached below.

integration.py (1.6 KB)

test2.py (1.1 KB)

Metropolis is likely not suitable for this kind of problem (it is inefficient), and to make HMC work you need to hand-tune a lot of parameters.
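
For example, something along these lines (a rough sketch, not a recommendation of specific values; the keyword names are the ones I remember from the pymc3 HMC step methods, so check them against your version, and replace basic_model with your own model):

with basic_model:
    step = pm.HamiltonianMC(path_length=2.0, adapt_step_size=True, target_accept=0.9)
    trace = pm.sample(1500, tune=1000, step=step, cores=2, chains=2)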

Perhaps this is a really dumb question …

Would the custom Theano Op Integrate work when either start or stop is infinite? I wonder because it calls scipy.integrate.quad, and that function can handle infinite limits.

Edit: when trying to run this, I get the error TypeError: Cannot convert Type TensorType(float32, scalar) (of Variable a) into Type TensorType(float64, scalar). You can try to manually convert a into a TensorType(float64, scalar).

Hi

This post has been very useful to me and I was able to do the integrals that I need. However, I get an error when actually running a model calculation. It is the same error I get when I run @AH5719's scripts:

Traceback (most recent call last):
  File "/home/lorenzo/.local/lib/python3.8/site-packages/pymc3/theanof.py", line 80, in floatX
    return X.astype(theano.config.floatX)
AttributeError: 'list' object has no attribute 'astype'

During handling of the above exception, another exception occurred:

TypeError: float() argument must be a string or a number, not 'TensorVariable'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test2.py", line 44, in <module>
    y = pm.Normal('y', mu=mu, sd=0.1, observed=y_obs)
  File "/home/lorenzo/.local/lib/python3.8/site-packages/pymc3/distributions/distribution.py", line 82, in __new__
    dist = cls.dist(*args, **kwargs)
  File "/home/lorenzo/.local/lib/python3.8/site-packages/pymc3/distributions/distribution.py", line 91, in dist
    dist.__init__(*args, **kwargs)
  File "/home/lorenzo/.local/lib/python3.8/site-packages/pymc3/distributions/continuous.py", line 488, in __init__
    self.mean = self.median = self.mode = self.mu = mu = tt.as_tensor_variable(floatX(mu))
  File "/home/lorenzo/.local/lib/python3.8/site-packages/pymc3/theanof.py", line 83, in floatX
    return np.asarray(X, dtype=theano.config.floatX)
  File "/opt/conda/lib/python3.8/site-packages/numpy/core/_asarray.py", line 83, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

So, it seems that pm.Normal does not accept a list of Integrate class objects. I’m running the code with python 3.8.5, pymc3 3.9.2 and theano 1.0.5. Does anybody have an idea how to solve this? Thanks a lot!

I found a workaround: just cast the list of integrals into a tensor variable, i.e. modify the following line in the test2.py file from the previous post to

y = pm.Normal('y', mu=tt.as_tensor_variable(mu), sd=0.1, observed=y_obs)

Now, everything runs well.


Thank you for all the help, this was very useful. I am trying to generalize the function to the case where the extra parameters can be vectors or matrices, e.g. for the following case:

y_obs = 8.3
start = aesara.shared(1.)
stop = aesara.shared(2.)

with pm.Model() as basic_model:
    
    a = pm.Uniform('a', 1.5, 3.5)
    b = pm.Uniform(
        'b', 
        4., 6., 
        shape=(3)
    )

    # Define the function to integrate in plain pytensor
    t = at.dscalar('t')
    t.tag.test_value = np.zeros(())
    a_ = at.dscalar('a_')
    a_.tag.test_value = np.ones(())*2.
    
    b_ = at.dvector('b_')
    b_.tag.test_value = np.ones((3))*5.

    func = t**a_ + b_.sum()
    integrate = Integrate(
        # Function we're integrating
        func, 
        # variable we're integrating
        t, 
        # other variables
        a_, b_
    )

    # Now we plug in the values from the model.
    # The `a_` and `b_` from above corresponds 
    # to the `a` and `b` here.
    mu = integrate(
        start, 
        stop, 
        a, 
        b
    )
    y = pm.Normal(
        'y', 
        mu=mu, 
        sigma=0.4, 
        observed=y_obs
    )
    
    trace = pm.sample(
        1500, 
        tune=500, 
        cores=2, 
        chains=2,
        return_inferencedata=True
    )

When I run the above with the Integrate Op as defined above in this thread, I get the following error:

ValueError: Integrate.grad returned a term with 0 dimensions, but 1 are required.

I assume we should specify somewhere that the gradient of b is 1-d now, but where should that happen? Thanks!

Hi folks,

I adapted the function to work with (either scalar- or vector-valued) functions f: \mathbb{R}^{d_1} \times ... \times \mathbb{R}^{d_n} \rightarrow \mathbb{R}^k, such that the argument with respect to which we're integrating is a scalar and the other arguments are either scalars or vectors.

import numpy as np
import pymc as pm
import pytensor as aesara
import pytensor.tensor as at

# Needed for integration
from scipy.integrate import quad, quad_vec
from pytensor.graph.op import Op
from pytensor.graph.basic import Apply
from pytensor import clone_replace

class Integrate(Op):
    
    # Class to perform integration of a scalar variable
    # on a bounded interval
    
    # Adapted from:
    # https://discourse.pymc.io/t/custom-theano-op-to-do-numerical-integration/734/12
    # With some modifications!
    
    def __init__(self, expr, var, *extra_vars):
        """
        Parameters
        ----------
        expr: Aesara Variable
            The expression encoding the output
        var: Aesara Variable
            The input variable
        """
        super().__init__()
        
        # function we're integrating
        self._expr = expr
        
        # input var we're integrating over
        self._var = var
        
        # other variables
        self._extra_vars = extra_vars
        
        # transform expression into callable function
        self._func = aesara.function(
            # a list with all the inputs
            [var] + list(extra_vars),
            # output
            self._expr,
            on_unused_input='ignore'
        )
    
    def make_node(self, start, stop, *extra_vars):
        
        self._extra_vars_node = extra_vars
        
        # make sure that the same number of extra variables
        # are passed here as were specified when defining the Op
        assert len(self._extra_vars) == len(extra_vars)
        
        # Define the bounds of integration
        self._start = start
        self._stop = stop
                
        # return an Apply instance with the input and output Variable
        return Apply(
            # op: The operation that produces `outputs` given `inputs`.
            op=self, 
            # inputs: The arguments of the expression modeled by the `Apply` node.
            inputs=[start, stop] + list(extra_vars), 
            # outputs: The outputs of the expression modeled by the `Apply` node.
            # NOTE: This is a scalar if self._expr is a scalar,
            # and a vector if self._expr is a vector. Etc.
            outputs=[self._expr.type()]
        )
    
    def perform(self, node, inputs, out):
        """
        Out is the output storage.
        Inputs are passed by value.
        A single output is returned indirectly 
        as the first element of single-element lists (out)
        
        NOTE: There's a restriction, namely the variable to integrate
        has to be a scalar, even though the other variables can vector.
        
        Parameters
        ----------
        node: Apply node
            The output of make_node
        inputs: List of data
            The data can be operated on with numpy
        out: List
            output_storage is a list of storage cells where the output 
            is to be stored. There is one storage cell for each output of the Op. 
            The data put in output_storage must match the type of the symbolic output. 
        """

        # Runs the computation in python
        start, stop, *args = inputs
                
        if self._expr.ndim == 0:
            val = quad(
                self._func, 
                start, 
                stop, 
                args=tuple(args)
            )[0]
        elif self._expr.ndim == 1:
            # if the function is vector-valued
            # (e.g., the gradient of a vector),
            # use quad_vec
            val = quad_vec(
                self._func,
                start,
                stop,
                args=tuple(args)
            )[0]
        else:
            shape = self._func(
                start,
                *args
            ).shape
            
            def helpfunc(*args):
                return self._func(*args).flatten()
            
            val = quad_vec(
                helpfunc,
                start,
                stop,
                args=tuple(args)
            )[0].reshape(shape)
        
        # in-place modification of "out".
        # out is a single-element list
        out[0][0] = np.array(val)
        
    def grad(self, inputs, grads):
        """
        NOTE: This function does not calculate the gradient
        but rather implements part of the chain rule,
        i.e. multiplies the grads by the gradient wrt to the cost
        See https://aesara.readthedocs.io/en/latest/extending/op.html
        for an explanation
        
        Inputs in this case contains: 
        [lower integration bound, upper integration bound, ...other variables of function]
        """
        
        # unpack the input
        start, stop, *args = inputs
        out, = grads
        
        # dictionary with the extra variables as keys
        # and the extra variables in "inputs" as values
        replace = dict(zip(
            self._extra_vars, 
            args
        ))
        
        # The derivative of integral wrt to the upper limit of integration
        # is just the value of the function at that limit
        # (for lower limit, it's minus the function)
        # See e.g.,
        # https://math.stackexchange.com/questions/984111/differentiating-with-respect-to-the-limit-of-integration
        replace_ = replace.copy()
        replace_[self._var] = start
        dstart = out * clone_replace(
            # Clone a graph and replace subgraphs within it. 
            # It returns a copy of the initial subgraph with the corresponding
            # substitutions.
            -self._expr, 
            # Dictionary describing which subgraphs should be replaced by what.
            replace=replace_
        )
        
        replace_ = replace.copy()
        replace_[self._var] = stop
        dstop = out * clone_replace(
            self._expr, 
            replace=replace_
        )

        # calculate the symbolic gradient of self._expr
        # wrt each extra variable.
        # This can be done because they're symbolic aesara variables!
        # This corresponds to the gradient of the expression
        # *inside* the integral (the inner part of Leibniz'
        # integral rule)
        grads = at.jacobian(
            # cost
            self._expr, 
            # wrt
            self._extra_vars
        )
        
        dargs = []
        # loop over the gradients of the extra vars
        for grad in grads:
                        
            # define an Apply node
            # for that gradient
            integrate = Integrate(
                grad, 
                # variable we're integrating over
                self._var, 
                *self._extra_vars
            )
            
            # Apply Leibniz' integral rule:
            # call integrate, which evaluates
            # the integral of the gradient.
            # And then multiply with previous gradient
            # that was passed in the input.
            # NOTE: This is not actually doing the operation,
            # but rather calling make_node, which *creates the node*
            # that does the operation
            darg = at.dot(
                integrate(
                    start, stop, 
                    *args
                ).T, 
                out  
            )
            
            dargs.append(darg)
        
        # return a list with one Variable for each input in inputs
        return [dstart, dstop] + dargs

I changed a few things:

  • Apply in the return value of make_node now gets outputs=[self._expr.type()], since the expression can be any shape.
  • perform adapts depending on the shape of the output of self._func. If the output is a scalar, it uses quad. If it is a vector, it uses quad_vec. Otherwise, it defines a helper function that flattens the output of self._func, runs quad_vec on that, and finally reshapes the result back into the original shape.
  • The previous point is useful because in grad() we integrate, with respect to self._var, the Jacobian of self._expr with respect to each of the extra variables. Since the extra variables can be vectors and self._expr can itself be vector-valued, that Jacobian can be up to 2-dimensional.

I tested the code with a modified version of the example above, where one of the inputs (b) is now a vector, and the function to integrate, i.e., f(a, b, t) = t^a + b, is vector valued. We are calculating \int_1^2 f(a, b, t) dt:

### Example of usage

y_obs = np.array([8.3, 8.0, 7.8])
start = aesara.shared(1.)
stop = aesara.shared(2.)

with pm.Model() as basic_model:
    
    a = pm.Uniform('a', 1.5, 3.5)
    b = pm.Uniform(
        'b', 
        4., 6., 
        shape=(3)
    )

    # Define the function to integrate in plain pytensor
    t = at.dscalar('t')
    t.tag.test_value = np.zeros(())
    
    a_ = at.dscalar('a_')
    a_.tag.test_value = np.ones(())*2.
    
    b_ = at.dvector('b_')
    b_.tag.test_value = np.ones((3))*5.
    
    func = t**a_ + b_
    integrate = Integrate(
        # Function we're integrating
        func, 
        # variable we're integrating
        t, 
        # other variables
        a_, b_
    )

    # Now we plug in the values from the model.
    # The `a_` and `b_` from above corresponds 
    # to the `a` and `b` here.
    mu = integrate(
        start, 
        stop, 
        a, 
        b
    )
    y = pm.Normal(
        'y', 
        mu=mu, 
        sigma=0.4, 
        observed=y_obs
    )

with basic_model:
    trace = pm.sample(
        1500, 
        tune=500, 
        cores=2, 
        chains=2,
        return_inferencedata=True
    )

Any feedback would be greatly appreciated - there might be some mistakes. I have also used this for a more complex model and it’s extremely slow, so it would be awesome to find ways of making it faster. Thanks!