My colleagues and I have written a methodology paper about Bayesian methods for analyzing plant development. I have developed a Dash application (with a pymc3 backend) for interactive analysis using our Bayesian methodology (the target audience is not mathematicians).
My Dash app with pymc3 in the backend works fine on localhost but is unusable when running on the Ubuntu 18.10 server. When initialized, it emits these warnings:
Apr 04 09:09:31 bayes4plants uwsgi: *** Operational MODE: preforking ***
Apr 04 09:09:33 bayes4plants uwsgi: /home/bayes/bayesiansurvivalanalysis-webapp/venv/lib/python3.6/site-packages/theano/configdefaults.py:560: UserWarning:
Apr 04 09:09:33 bayes4plants uwsgi: DeprecationWarning: there is no c++ compiler.This is deprecated and with Theano 0.11 a c++ compiler will be mandatory
Apr 04 09:09:33 bayes4plants uwsgi: WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove t
Apr 04 09:09:33 bayes4plants uwsgi: WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Apr 04 09:09:33 bayes4plants uwsgi: unable to load configuration from from multiprocessing.semaphore_tracker import main;main(14)
Apr 04 09:09:34 bayes4plants uwsgi: WSGI app 0 (mountpoint='') ready in 3 seconds on interpreter 0x55f25afeb470 pid: 1861 (default app)
If I wait long enough the sampling finishes, but the computation time is not feasible at all (and much longer than when run locally).
So far I tried:
- Ported my app from Heroku to DigitalOcean so I can manage the server more directly.
- Installed g++ and other libraries according to the instructions in the Theano documentation (with no effect).
- Pointed to the exact location of the g++ executable on my server (/usr/bin/g++) in ~/.theanorc. This led to another error, so I don’t use it now.
Right now I am out of ideas. Can you help me?
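One way to check what the server process itself can see is a small diagnostic run inside the same service (not from an interactive shell). This is only a minimal sketch: the helper name `find_compiler` is hypothetical, it uses nothing but the standard library, and it only reports whether g++ is reachable through the PATH of the current process, not how Theano resolves the compiler.

```python
import os
import shutil


def find_compiler(name="g++", path=None):
    """Return the full path of `name` as found on `path`, or None if missing.

    `path` defaults to this process's PATH, which under uwsgi or systemd
    can differ from the PATH of an interactive login shell.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    return shutil.which(name, path=path)


if __name__ == "__main__":
    print("PATH seen by this process:", os.environ.get("PATH", ""))
    print("g++ resolved to:", find_compiler())
```

If this prints a path in a terminal but `None` when called from within the web service, the compiler is installed but not visible to the service's environment.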
This is honestly not so much a PyMC3 issue as a Dash deployment issue. Have you tried asking this on the Dash forums, or better yet, filed a clear bug report on Dash’s GitHub page (I’ve had more success getting them to respond to queries that way)?
Right away, I would expect that the hosting service does not offer as much computational power as does your local machine, and that’s why you’re experiencing this slowdown. Can you confirm what the host is running?
Honestly, I must disagree. The warning I posted quite clearly says the problem is in Theano (not Dash).
Sure, the server's computational power is one of the factors, but definitely not the main one. The warning says I should expect longer computation times because Theano didn’t find g++.
When you say Dash, do you mean this?
Let’s use a simple model to isolate the problem. I see that g++ was not detected, but you said you installed it. The first step is to install theano (1.0.4) and then pymc3 (3.6). Run this:
import scipy.stats as stats
import pymc3 as pm
import matplotlib.pyplot as plt

# Simulate 1000 noisy observations around a known mean and spread
true_sigma = 1.2
true_mu = 5.7
noise = stats.norm.rvs(size=1000, loc=0, scale=0.3)
true_y = stats.norm.rvs(size=1000, loc=true_mu, scale=true_sigma)
y_obs = true_y + noise

# Fit a simple normal model and sample from the posterior
with pm.Model() as model:
    sigma = pm.HalfCauchy('sigma', beta=5)
    mu = pm.Uniform('mu', lower=0, upper=10)
    post = pm.Normal('post', mu=mu, sd=sigma, observed=y_obs)
    trace = pm.sample(draws=2000, tune=2000)
My laptop takes 5 s. Tell us what you see.
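To compare the laptop and the server on equal terms, the sampling call can be wrapped in a small timer. This is just a sketch; `timed` is a hypothetical helper, not part of pymc3.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Inside the `with pm.Model() as model:` block above, `trace, seconds = timed(pm.sample, draws=2000, tune=2000)` would then give a wall-clock number to report from each machine. Note that it includes compilation time as well as sampling time.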
My response was a bit confused and unclear, so I apologize. What I meant is that this is a server/framework error rather than a PyMC3 one (specifically, Theano is not finding g++, as you say).
The only advice I can give is how I personally set up Theano on Google’s Compute Engine on Ubuntu 14.04 images (I’ve had similar issues myself very, very often): use anaconda to install Intel’s Python distribution (which already has everything decently optimized out of the box), then conda install Theano, and lastly run sudo apt install g++ for g++.
If worse comes to worst, you can try setting up your environment that way first.
I combined your suggestions, did some testing and solved the problem.
The problem was that the local environment somehow wasn’t able to find the installed g++. I configured the service to use the global versions of the packages, and everything works smoothly.
The time still isn’t perfect, but it is as fast as on the laptop.
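For anyone who hits the same thing but wants to keep the virtual environment, one alternative that might work (it is not what I did, and I have not tested it on this server) is to make sure the directory containing g++ is on the PATH of the service process before Theano is imported. A minimal sketch, where `prepend_to_path` is a hypothetical helper and `/usr/bin` is the g++ location mentioned earlier in this thread:

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH in `env` if it is not already listed."""
    if env is None:
        env = os.environ
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


# Assumption: g++ lives in /usr/bin on this server.
# This has to run before `import theano` or `import pymc3`, because the
# "g++ not detected" check happens when Theano is first imported.
prepend_to_path("/usr/bin")
```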
Thank you very much.
Once we finish the paper and website, I’ll post the link. Working with pymc3 is amazing and I am glad it allows us to build interesting models very fast.
Just for information. By Dash I mean Plotly Dash.
As promised, I am sharing the first public version of our web app using PyMC3.
It is still under construction, as is our paper. The maths and the motivation behind the model will be clearer once the paper is out. We welcome any suggestions.
Thank you once more, PyMC3!