Using PyMC3 with external functions

MNGuenther · December 8, 2020, 8:32pm

I am trying to use PyMC3 to replace emcee as my MCMC sampler of choice. However, I depend on external functions to draw the model curves, which I import from existing packages that are not compatible with Theano’s style. How can I still use PyMC3 as a sampler? For example:

#::: Option 1: How the external fct currently looks
def external_fct_1(x, xmid, depth):
    y = np.zeros_like(x)
    y[ (x>xmid-0.1) & (x<xmid+0.1) ] = -depth
    return y


#::: Option 2: Rewriting the external fct in Theano style
def external_fct_2(x, xmid, depth):
    y = np.zeros_like(x)
    y[ (tt.ge(x,xmid-0.1) & tt.le(x,xmid+0.1)) ] = -depth
    return y


#::: Option 3: Using the as_op decorator
@as_op(itypes=[tt.dvector, tt.dscalar, tt.dscalar], otypes=[tt.dvector])
def external_fct_3(x, xmid, depth):
    y = np.zeros_like(x)
    y[ (x>xmid-0.1) & (x<xmid+0.1) ] = -depth
    return y


def run():

    #::: Simulated Data
    x = np.linspace(0,0.6,1000)
    xmid = 0.3
    depth = 0.01
    sigma = 0.003
    y = external_fct_1(x, xmid, depth) + np.random.randn(len(x)) * sigma
    # y = external_fct_2(x, xmid, depth) + np.random.randn(len(x)) * sigma
    # y = external_fct_3(x, xmid, depth) + np.random.randn(len(x)) * sigma
    fig, ax = plt.subplots()
    ax.plot(x, y, 'b.')
    
    
    #::: Model Definition and Fitting
    model = pm.Model()
    
    with model:
    
        # Priors for unknown model parameters
        xmid = pm.Uniform("xmid", lower=0.28, upper=0.32)
        depth = pm.Uniform("depth", lower=0, upper=0.05)
        sigma = pm.Uniform("sigma", lower=0.001, upper=0.005)
    
        # Expected value of outcome
        mu = external_fct_1(x, xmid, depth)
        # mu = external_fct_2(x, xmid, depth)
        # mu = external_fct_3(x, xmid, depth)
    
        # Likelihood (sampling distribution) of observations
        y_obs = pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
        
        # draw 500 posterior samples
        trace = pm.sample(500, return_inferencedata=False)

    az.plot_trace(trace)
    print( az.summary(trace, round_to=2) )

The simulated data set looks like this:

If I use external_fct_1 I get the following error:
```
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
```
This one is clear/expected because I know Theano cannot handle the “<” and “>”
operators.

If I use external_fct_2 I get the following error:

MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(xmid_interval__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

Why does this happen and how can it be solved?
But either way, it would require rewriting the entirety of external packages in Theano style, which is unfeasible.

If I use external_fct_3 I get the following error:
```
 AttributeError: 'numpy.ndarray' object has no attribute 'type'
```
Why does this happen and how can it be solved?

Finally, what is the recommended way to wrap PyMC3 as an MCMC sampler around any external function?

JJR4 · December 9, 2020, 9:39pm

Hey @MNGuenther, I also depend on many external functions (sometimes quite complicated like ode solvers etc.) to produce my modeled data. Currently the solution I have been using is to follow the prescription in this post: https://docs.pymc.io/notebooks/blackbox_external_likelihood.html, using a custom likelihood function, since it is in most cases non-trivial to write a theano version of the external functions in my case. I do not make use of the cython aspect of the code currently but following the same principle has worked well for me, though I am sure it is likely not the most efficient method since many of the computations happen outside of theano so likely the entire model is not very optimized (?)

Either way it is a solution, but I am very interested to hear if there is a more appropriate solution.

xgomezdans · December 11, 2020, 1:03pm

I also rely on external functions a lot. In some cases, they’re not even Python, but rely on calling some old compiled Fortran code. I did find some issues in the example @JJR4 provided, but made it to work. A self contained examples is here.

I saw a post a few weeks’ from @_eigenfoo showing he had extracted the samplers from pymc. I now see his project is archived/discontinued. It’d be nice to have access to the population samplers etc without the rest of pymc for these kind of applications.

_eigenfoo · December 12, 2020, 4:16pm

Hey @xgomezdans, thanks for tagging! To speak plainly, I archived the repository because it was becoming hard for me to maintain (i.e. maintain even commit-for-commit with pymc3/master), especially with so few users (in fact, I think almost none at all!).

I’d be happy to unarchive and continue maintaining the project, if I was convinced that there were a small but healthy community of users who are earnestly using the software.

Unfortunately, I don’t have the bandwidth to maintain more than just the NUTS/HMC samplers - unless someone would like to contribute to isolate the population samplers, I don’t anticipate littlemcmc offering them anytime soon.

Topic		Replies	Views
Using pymc3 NUTS sampler for an "external model" having its derivatives with respect to its parameters Questions	4	2364	March 30, 2018
Using theano scan with pymc3 Questions	3	874	December 4, 2020
Pymc3 likelihood math with non-theano function Questions theano	4	1075	September 25, 2017
Connecting PyMC3 to external code - help with understanding Theano custom Ops Questions theano	20	4217	December 9, 2021
Error in the sampling process, 'sd' not taken in account in Metropolis.py Questions	8	1446	September 29, 2017

Using PyMC3 with external functions

Related topics