I’m getting to grips with pymc and Bayesian modeling by running some of the Bambi Jupyter notebook worked examples…
I seem to be having real performance issues just fitting the Bambi GLM models in the example. One of the models, which takes about 7 seconds to fit in the example, takes over an hour on my VM, so something must be wrong! (And not with the model setup or code, as it was run directly from the example with the same data.)
This is the example I’m running:
I’m using an Azure VM (Standard E4ads v5 (4 vcpus, 32 GiB memory))
Is this a matter of needing a better VM (more vCPUs?), or is there something else that might have gone wrong in setup to cause such slow sampling?
Any help or suggestions on how to troubleshoot this and speed things up would be appreciated!
There’s only one known issue with predictions, but this is not connected to your problem.
In the meantime, you could also compare the results in the example notebook with the results you obtain on your VM. If they differ a lot, it may indicate a problem with the model specification (which shouldn’t be the case, AFAIK).
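For example, printing a posterior summary on both sides gives you something concrete to compare (a minimal sketch, assuming your fit returned an ArviZ InferenceData object called idata):

```python
import arviz as az

# Summarize the posterior from your run; compare the means, sds, and r_hat
# values against those printed in the example notebook.
print(az.summary(idata))
```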
Hi, I just checked Bambi (versions 0.6.3 and 0.7.1) on an MS Azure notebook, and they both run smoothly.
I used pip to install these specific versions: pip install scipy==1.7.3 arviz==0.11.4 bambi==0.6.3 watermark
or: pip install scipy==1.7.3 arviz==0.11.4 bambi==0.7.1 watermark
The model fitting takes around 30-40 seconds (less than 1 minute) on a small VM (Standard_D11_v2, 2 cores, 14 GB RAM, 100 GB disk).
Sorry for the slow response; I’ve had limited time to work on this recently, but thankfully I have made some progress! I reinstalled the entire environment from scratch using the recommended versions of PyMC and Bambi, and now it samples at a more reasonable speed (though it only runs on one core).
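In case it helps anyone else, I believe Bambi’s fit() forwards its keyword arguments to pm.sample, so parallel chains can be requested explicitly; a minimal sketch, assuming a fitted Bambi model object named model:

```python
# Sketch: request four chains sampled across four cores (fit() should pass
# these keyword arguments through to pm.sample).
idata = model.fit(draws=1000, chains=4, cores=4)
```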
I’ve been playing with some simple negative binomial models in Bambi, and have a question about interpreting the posterior predictive visualization in ArviZ. I know my data is fairly zero-inflated, so I would expect to see posterior predictive values at zero (as in the observed data).
The posterior predictive mean is similar to the observed data for positive integers, but there appear to be no predictions for “0”. Is this to be expected? Is it just that the “0” predictions are not displayed on the default ArviZ plot_ppc for a negative binomial distribution?
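For reference, this is the kind of check I’ve been trying (a rough sketch; the response name "y" and the idata object are placeholders, and it assumes the posterior predictive group has already been populated, e.g. via posterior predictive sampling):

```python
import numpy as np

# Posterior predictive draws, shape (chain, draw, observation), and the
# observed response pulled from the same InferenceData object.
pps = idata.posterior_predictive["y"].values
obs = idata.observed_data["y"].values

# Compare how often zeros occur in the data versus in the predictions.
print("observed proportion of zeros: ", (obs == 0).mean())
print("predicted proportion of zeros:", (pps == 0).mean())
```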
Any advice would be greatly appreciated… I’m still finding my feet here, but I have now worked through “Bayesian Analysis with Python” and “Bayesian Statistics for Beginners”, which has at least given me a baseline level of knowledge to work with.
Could you share a reproducible example so we can investigate the problem? From the chart I can see that your data has values like 0, 1, 2, 3, etc., but the model is predicting something quite different.
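Something minimal and self-contained along these lines would be ideal (a sketch with synthetic data; the formula, family name, and parameters here are illustrative, so adjust them to match your model):

```python
import numpy as np
import pandas as pd
import bambi as bmb

# Synthetic zero-heavy count data, just to make the example self-contained.
rng = np.random.default_rng(1234)
df = pd.DataFrame({"x": rng.normal(size=200)})
df["y"] = rng.negative_binomial(2, 0.7, size=200)

# A simple negative binomial regression in Bambi.
model = bmb.Model("y ~ x", df, family="negativebinomial")
idata = model.fit()
```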
On top of that, I recommend updating to Bambi 0.8.0, which was released a few days ago.