Gaining speed in sampling an ODE

fabianrost84 · November 28, 2019, 3:33pm

I was really pleased to see DifferentialEquation in pymc3, so I started working on a problem of mine where I could use it. Currently I do ML estimates, but going Bayesian would be great. I’m already very grateful for the support I got from @dpananos and @michaelosthege (Return value for 2n-dimensional ODE system). Thanks!

My main problem now is that pymc3 is much too slow when sampling. Here is an example notebook:

https://gist.github.com/fabianrost84/f6ffe42fed6f4a8165e76e15d98edcee

ode.ipynb

It is quite lengthy, as it is the actual task I am working on. I simulate some data and perform parameter recovery. The original data I work on has the same structure.

I infer parameters for two different models. The first model (A) is an n-dimensional system of ODEs with only 2 free parameters. This model still seems doable and I get reasonable results, but sampling is quite slow. Posterior analysis, for example traceplot, is very slow as well.

However, the second model (D), which is a 2*n-dimensional ODE system with 5 free parameters, is far too slow to finish in reasonable time on my machine. But this is the model I work on, and ML estimates seem to be OK (not shown in the notebook).

I already hacked the DifferentialEquation class to use solve_ivp, which is much faster than odeint in my case (this also inspired Solve_ivp for Differential Equation).
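For readers who want to compare the two SciPy solvers outside of PyMC3, here is a minimal sketch. It assumes a toy linear decay model in place of the real system; `rhs`, `k`, `y0` and `t_eval` are illustrative names, and it does not show the actual change made to the DifferentialEquation class.

```python
import numpy as np
from scipy.integrate import odeint, solve_ivp


def rhs(t, y, k):
    """Toy linear decay dy/dt = -k * y (a stand-in for the real model)."""
    return -k * y


t_eval = np.linspace(0.0, 5.0, 11)
y0 = np.array([1.0, 2.0])
k = 0.7

# odeint expects f(y, t, ...) by default; tfirst=True lets it reuse rhs(t, y, k).
y_odeint = odeint(rhs, y0, t_eval, args=(k,), tfirst=True)

# solve_ivp returns an array of shape (n_states, n_times),
# so transpose it to match odeint's (n_times, n_states).
sol = solve_ivp(
    lambda t, y: rhs(t, y, k),
    (t_eval[0], t_eval[-1]),
    y0,
    t_eval=t_eval,
    method="LSODA",
    rtol=1e-8,
    atol=1e-8,
)
y_ivp = sol.y.T

print(np.max(np.abs(y_odeint - y_ivp)))
```

Which solver is faster depends on the system and the chosen method, so it is worth timing both on the actual model.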

Is there anything else I can do to speed up computations? Any advice would be very welcome.

dpananos · November 28, 2019, 5:37pm

I know this seems a bit silly to be saying on PyMC3’s discourse, but have you thought about using Stan for its ODE capability? Their implementation absolutely crushes mine, so if speed is a concern, that would be my recommendation. DifferentialEquation is still new functionality and there is a lot of work to be done.

MatthewH · December 11, 2019, 3:26pm

I’d also like to add my thanks for starting the DifferentialEquation work; it’s something that could be very useful for my research. Unfortunately, I’ve also found that speed is a limiting factor: my code basically grinds to a halt if I try anything too complex.

I’ve done some profiling, and memory allocations currently seem to be a bottleneck: in 1 minute of sampling, roughly 20 million allocations are made for a total of 1.25 GB of memory, almost all of which is de-allocated very quickly. Digging a bit further shows that the vast majority of these allocations are below odeint call_odeint_user_function in the stack and come from theano ops. My guess (and it is only a guess) is that this is due to the use of theano in utils.augment_system of the ode code. Whilst it’s clearly very elegant from the programming perspective to get theano to calculate the jacobian, I think the result is that theano is building up and tearing down its memory framework for every call to the user function, producing the rather extreme memory allocation use. Let me know if there’s any other data that might be useful.
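To get a feel for why overhead per call to the user function adds up, it can help to count how often the solver calls that function in a single solve. A minimal sketch, assuming a toy decay model; `CountingRHS` and `decay` are hypothetical names and not part of PyMC3:

```python
import numpy as np
from scipy.integrate import odeint


class CountingRHS:
    """Wrap an ODE right-hand side and count how often the solver calls it."""

    def __init__(self, func):
        self.func = func
        self.n_calls = 0

    def __call__(self, y, t, *args):
        self.n_calls += 1
        return self.func(y, t, *args)


def decay(y, t, k):
    """Toy model dy/dt = -k * y, standing in for the real system."""
    return -k * y


counted = CountingRHS(decay)
times = np.linspace(0.0, 10.0, 50)
odeint(counted, np.ones(4), times, args=(0.5,))

calls_per_solve = counted.n_calls
print(f"RHS evaluations in one solve: {calls_per_solve}")
```

Any fixed cost paid inside the user function is multiplied by this count, and again by the number of solves the sampler performs.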

Gon_F · December 13, 2019, 4:33am

It seems like you profiled at least one source of the bottlenecks quite well. I am still relatively new to optimizing complex mathematical operations, so what would be the obvious way to potentially work on this particular bottleneck? Re-using the old memory?

I am genuinely wondering here. ODE functionality in Pymc3 is already quite amazing!

michaelosthege · December 13, 2019, 12:48pm

You’re right: the context switch in the iterations of the ODE solver is a huge problem.

There are a few things that are not too difficult and could lead to some acceleration:

- numba-compiled functions of sympy-derived augment_system (instead of theano)
- adjusting absolute/relative tolerances for odeint
- option to turn off sensitivities & using a gradient-free sampler (only small & easy models)
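The effect of the tolerance settings can be checked directly with SciPy. This is a minimal sketch assuming a harmonic oscillator as a stand-in model; `oscillator` and `solve` are illustrative names, and the tolerance values are arbitrary examples rather than recommendations.

```python
import numpy as np
from scipy.integrate import odeint


def oscillator(y, t, omega):
    """Harmonic oscillator, a stand-in for the real model."""
    return [y[1], -omega**2 * y[0]]


times = np.linspace(0.0, 20.0, 100)
y0 = [1.0, 0.0]


def solve(rtol, atol):
    """Solve once and return the solution plus the number of RHS evaluations."""
    y, info = odeint(
        oscillator, y0, times, args=(2.0,),
        rtol=rtol, atol=atol, full_output=True,
    )
    # info["nfe"] holds the cumulative RHS evaluation count at each output time.
    return y, int(info["nfe"][-1])


y_tight, nfe_tight = solve(1e-10, 1e-10)
y_loose, nfe_loose = solve(1e-4, 1e-4)

max_diff = np.max(np.abs(y_tight - y_loose))
print(f"tight: {nfe_tight} evaluations, loose: {nfe_loose} evaluations")
print(f"largest difference between the two solutions: {max_diff:.2e}")
```

Looser tolerances reduce the number of right-hand-side evaluations at the cost of accuracy, so the acceptable setting depends on how the solver error compares with the noise in the data.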

And then there’s the optimal solution, which is much faster but which we need more help to implement:

1. make cross-OS conda-installable package that wraps sundials
2. sympy analysis to get augment_system
3. numba-compile augment_system generated code in a way that it acts directly on sundials data types
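As a rough illustration of what a sympy-derived augmented system could look like, the sketch below builds the forward sensitivity equations symbolically for a hypothetical two-state model. The model, the symbol names and `aug_func` are assumptions made for this example; this is not the proof of concept mentioned in this post, and it does not involve sundials or numba.

```python
import sympy as sp

# States and parameters of a toy system (hypothetical, for illustration only).
y1, y2 = sp.symbols("y1 y2")
a, b = sp.symbols("a b")
states = sp.Matrix([y1, y2])
params = sp.Matrix([a, b])

# Right-hand side f(y, p): dy1/dt = -a*y1, dy2/dt = a*y1 - b*y2
f = sp.Matrix([-a * y1, a * y1 - b * y2])

# Sensitivities S[i, j] = d y_i / d p_j obey dS/dt = (df/dy) * S + df/dp.
S = sp.Matrix(2, 2, lambda i, j: sp.Symbol(f"s_{i}{j}"))
dS = f.jacobian(states) * S + f.jacobian(params)

# Stack the states and the row-major flattened sensitivities into one system.
aug_state = list(states) + list(S)
aug_rhs = list(f) + list(dS)

# lambdify turns the symbolic expressions into an ordinary Python function,
# so no symbolic graph has to be evaluated on each solver call.
aug_func = sp.lambdify([aug_state, list(params)], aug_rhs)

# Evaluate at y = (1, 2), all sensitivities zero, a = 0.5, b = 0.25.
out = aug_func([1.0, 2.0, 0.0, 0.0, 0.0, 0.0], [0.5, 0.25])
print(out)
```

The resulting function takes the augmented state and the parameters and returns the derivatives of both the states and the sensitivities, which is the shape of function an ODE solver needs.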

The optimal strategy actually avoids all the context switches and also avoids copying data around all the time. @aseyboldt already has a proof of concept, but point 1. as well as a clean object-oriented implementation are still under construction.
Based on Adrian’s work, I managed to build an entry point for the sympy-based analysis of a user-provided ODE system.
See here:
