Implementing something like tt.trapz() for numerical integration over a set domain?

I’m trying to create a custom likelihood function similar to the one described here: P(d|m) \propto exp \frac{-||d-f(m)||^2}{\sigma^2} where again d is observed data and m are parameters.

In my case however, f involves a numerical integration, something like f(\lambda, m) \propto \int{\eta(\lambda) h(\lambda, m) d\lambda}, where \lambda is the numerical domain over which I need to evaluate f. The function h(\lambda, m) is analytically defined, but \eta(\lambda) is some empirically-measured function defined over discrete values of \lambda. Am I wrong in thinking this rules out using the popular solution found here? If that won’t work, is there any way to do this integration?

I am new to the world of Bayesian data analysis, PyMC3, and Theano, but I think if Theano had a tt.trapz() function like numpy’s np.trapz(), I might be able to make this work, using a strategy like this one.

Any advice is greatly appreciated!

We have some post using numerical integration: Custom theano Op to do numerical integration you can take a look at try the theanoOps from that post.

Thanks for replying!

I maybe could have done a better job of calling out the links, but I already referenced the post you mentioned, which I’m fairly well acquainted with at this point. I think my situation might be different enough to make previous approaches ineffective.

I think I’m looking for something like trapezoidal integration rather than what’s been implemented previously, which takes an analytical function and limits of integration.

Oh right. I think you might be able to change the quad integrator in the post with scipy.integrate.trapz - maybe you can give this a try?

Thanks @junpenglao. This is probably not as simple as just replacing “quad” with “trapz,” right? I assume I will also have to edit the whole Integrate class, including the grad() function. The role of Theano Ops and their grad() functions remain a bit mysterious to me. I can spend more time with the documentation, but if you have any advice on this that would be very helpful.

Another option that I’ve been looking at is using a “black-box” likelihood function (link). It looks like the code from this blog post is very close to being a one-size-fits-most solution, but I might have to switch the gradient() function over to scipy avoid using cython. Are there any drawbacks to using the “black-box” method?

Thanks again!

It should work as the gradient of a integrate class is just the original function itself. So as long as it is some integrate approximation, you should be able to just replace the approximation.

Maybe too many years late. But I run into the same problem and followed @junpenglao 's advice and tried adapting numpy.trapz. I simply replaced np by at (ie. numpy by pytensor) and I commented out the exception rule (just for the sake of this example but may break things up in more complex uses). Results are equivalent, but would it make sense for every situation?:

import numpy as np

import pytensor.tensor as at

x = np.arange(100)

np.trapz(x)
Out[4]: 4900.5

def at_trapz(y, x=None, dx=1.0, axis=-1):
    if x is None:
        d = dx
    else:
        #x = asanyarray(x)
        if x.ndim == 1:
            d = at.diff(x)
            # reshape to correct shape
            shape = [1]*y.ndim
            shape[axis] = d.shape[0]
            d = d.reshape(shape)
        else:
            d = at.diff(x, axis=axis)
    nd = y.ndim
    slice1 = [slice(None)]*nd
    slice2 = [slice(None)]*nd
    slice1[axis] = slice(1, None)
    slice2[axis] = slice(None, -1)
    #try:
    ret = at.sum((d * (y[tuple(slice1)] + y[tuple(slice2)]) / 2.0), axis)
    # except ValueError:
    #     # Operations didn't work, cast to ndarray
    #     d = np.asarray(d)
    #     y = np.asarray(y)
    #     ret = add.reduce(d * (y[tuple(slice1)]+y[tuple(slice2)])/2.0, axis)
    return ret
    

at_trapz(x)
Out[6]: Sum{axis=[0], acc_dtype=float64}.0

at_trapz(x).eval()
Out[7]: array(4900.5)