Celery & pymc3 not happy together


#1

I’m developing a project which will potentially generate lots of pymc3 fitting tasks, which I want to set up as celery tasks to queue and handle. The problem is that as soon as I import pymc3 into the celery worker code, the worker exits.

This happens even without any actual pymc3 code:

[2017-08-06 12:33:49,948: ERROR/MainProcess] Process 'Worker-4' pid:7952 exited with 'exitcode 1'
[2017-08-06 12:33:49,948: ERROR/MainProcess] Process 'Worker-3' pid:15760 exited with 'exitcode 1'
[2017-08-06 12:33:49,948: ERROR/MainProcess] Process 'Worker-1' pid:15592 exited with 'exitcode 1'
[2017-08-06 12:33:50,765: ERROR/MainProcess] Process 'Worker-2' pid:13536 exited with 'exitcode 1'

If I comment out the import of pymc3 the worker runs fine:

import pandas as pd
#import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import patsy as pt
import seaborn as sns
from theano import shared

Any ideas?

Found this, which seems to be the same or a similar problem, but with no resolution:

And minimal code to reproduce (saved as testcelery.py):

from celery import Celery

app = Celery('tasks', broker='amqp://')
import pymc3 as pm


@app.task()
def add(x, y):
    return x + y

if __name__ == '__main__':
    app.start()

Launch the celery worker with:

celery worker -A testcelery.app --loglevel=info


#2

Could you try importing theano in one of the workers instead of pymc3? (maybe import theano, theano.tensor, theano.tensor.nlinalg)
Can you get a backtrace or a log from the workers somehow?
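One hedged option for getting a traceback out of a worker that dies without printing one: Python’s stdlib faulthandler can dump tracebacks to a file even when a C extension crashes the process. A minimal sketch (the worker_crash.log path is an assumption, not from the original thread), placed at the top of the worker module:

```python
import faulthandler

# Hypothetical crash-log path; pick somewhere the worker can write to.
crash_log = open('worker_crash.log', 'w')

# Dump tracebacks for all threads to the file on a fatal error
# (segfault, abort, etc.), which normal Python logging can't catch.
faulthandler.enable(file=crash_log, all_threads=True)
```

This won’t catch a clean sys.exit(1), but it rules out (or confirms) a hard crash inside theano’s compiled code.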


#3

This works:

from celery import Celery

app = Celery('tasks', broker='amqp://')
#import pymc3 as pm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import patsy as pt
import seaborn as sns
from theano import shared
import theano, theano.tensor, theano.tensor.nlinalg

@app.task()
def add(x, y):
    return x + y

if __name__ == '__main__':
    app.start()

Will see about the backtrace. If I add pymc3, it exits.


#4

Well, on other code errors the workers dump a backtrace to the terminal, so I’m not sure what is causing them not to do so now.


#5

Could you maybe open an issue on the celery bug tracker on github? Without input from the celery developers I’m a bit lost here. Feel free to CC me in the issue (@aseyboldt)


#6

I posted on their mailing list:
https://groups.google.com/forum/#!topic/celery-users/OH2Ztezei6Y


#7

I think I’ve solved my own issue by moving the import into the task. This seems to start up without exiting. I saw this treatment of theano somewhere in the context of using celery with theano.

from celery import Celery

app = Celery('tasks', broker='amqp://')

import pandas as pd
import numpy as np
import patsy as pt
import pyodbc
from flask import Flask, abort, request, jsonify, g, url_for
from datetime import datetime,date,timedelta

#import matplotlib.pyplot as plt
#import seaborn as sns
#from theano import shared
#import theano, theano.tensor, theano.tensor.nlinalg

@app.task()
def add(x, y):
    import pymc3 as pm
    return x + y

if __name__ == '__main__':
    app.start()
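The general pattern here — deferring a heavy or fork-sensitive import until the task body runs inside the already-forked worker process — can be sketched without celery at all. Using stdlib json as a stand-in for pymc3 (the fit_model name and body are illustrative, not from the thread):

```python
# Deferred-import pattern: the heavy module is imported inside the
# task function, so it loads in the forked worker process rather than
# in the parent before forking. `json` stands in for pymc3 here.
def fit_model(data):
    import json  # deferred import; replace with `import pymc3 as pm`
    # ...build and fit the model here...
    return json.dumps(data)

print(fit_model({"x": 1}))  # prints {"x": 1}
```

The cost is that the import runs on the first task execution in each worker process, which adds latency to that first call but avoids whatever module-level state pymc3/theano sets up before the fork.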

#8

Great, thanks for following up. I’d be quite curious about your experience productizing PyMC3 models (if that’s what you’re doing).


#9

Still busy. Lots of options here: productionise, productise, productionize or productize, as an aside :smiley: