Using az.compare With Bambi Models

Hi Friends,

I am trying to fit several models with Bambi and then compare them with the ArviZ az.compare function. I have seen lots of examples of using az.compare directly with PyMC models, but none with Bambi. Unfortunately, it does not work as expected with Bambi:

“TypeError: log likelihood not found in inference data object”

I found this thread, which looks like it might apply. I tried to follow it, but ran into trouble and wasn’t even sure I was on the right track.

Here is a minimal example. The Bambi models fit fine, but then az.compare fails. What am I missing??? :smiley:

My versions are printed at the top of the code below, and the full traceback ends with the TypeError quoted above.

My code is:

import sys

import numpy as np
import pandas as pd

import arviz as az
import bambi as bmb
import pymc as pm

# Print versions for reproducibility
print(sys.version)
print(az.__version__)
print(bmb.__version__)
print(pm.__version__)

penguins = pd.read_csv("../data/penguins.csv")

# Drop rows with missing values in any of the modeling columns
missing_data = penguins.isnull()[
    ["bill_length_mm", "flipper_length_mm", "sex", "body_mass_g"]
].any(axis=1)
penguins = penguins.loc[~missing_data].reset_index(drop=True)

print(penguins.shape)
penguins.sample(5)

# Intercept-free model: one mean body mass per species
species_model = bmb.Model("body_mass_g ~ 0 + species", penguins)
species_model.build()

species_model_results = species_model.fit()
species_model.predict(species_model_results, kind='pps', inplace=True)

species_model_results

# Same model plus flipper length as a covariate
species_one_covariate_model = bmb.Model("body_mass_g ~ 0 + species + flipper_length_mm", penguins)
species_one_covariate_model.build()

species_one_covariate_results = species_one_covariate_model.fit()
species_one_covariate_model.predict(species_one_covariate_results, kind='pps', inplace=True)

species_one_covariate_results

compare_dict = {
    'model_one': species_model_results,
    'model_two': species_one_covariate_results
}

compare_dict

az.compare(compare_dict)

At some point in the past, the default behavior of pm.sample was changed so that pointwise log-likelihoods are no longer computed and stored for every sample in the trace (it’s time- and memory-intensive, and a lot of people don’t need it).

For az.compare you do need it, so you have two options (sketched below):

  1. When you call model.fit, add the argument idata_kwargs=dict(log_likelihood=True). This signals to PyMC that you would like the pointwise log-likelihood to be computed and added to your idata after sampling is complete.
  2. Use idata = pm.compute_log_likelihood(idata), which will compute the log-likelihoods, add them to your idata, and give it back.
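
Applied to the models above, a minimal sketch of both options might look like this (I’m assuming species_model.backend.model is how your Bambi version exposes the underlying PyMC model for option 2; double-check that attribute path against your version):

# Option 1: request the pointwise log-likelihood at fit time.
species_model_results = species_model.fit(idata_kwargs=dict(log_likelihood=True))

# Option 2: compute it afterwards on an existing InferenceData.
# Bambi keeps the underlying PyMC model at model.backend.model
# (assumed attribute path; check your Bambi version).
with species_model.backend.model:
    species_model_results = pm.compute_log_likelihood(species_model_results)

Do the same for species_one_covariate_results, and az.compare(compare_dict) should then find the log_likelihood group it needs.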

Worked like a charm. Thank you, Jesse.

Just wanted to note that this solution also works for traces sampled with nutpie. I had to run nutpie via the compile_pymc_model() approach because my model has BART variables, and I couldn’t find a way to request the log-likelihoods directly through the nutpie API. This solution worked for me:

import nutpie
import pymc as pm

# Compile the PyMC model and sample with nutpie
compiled_model = nutpie.compile_pymc_model(bart_model)
trace = nutpie.sample(compiled_model, chains=6, tune=3000, draws=2000, target_accept=0.93)

# Add the pointwise log-likelihood to the trace afterwards
with bart_model:
    idata = pm.compute_log_likelihood(trace)