How to pass a coordinate to inference data

mihagazvoda · December 5, 2022, 7:29pm

I’ve been skimming through the ArviZ documentation and came across the eight schools inference data.

import arviz as az
idata = az.load_arviz_data("centered_eight")

The InferenceData object also includes a school coordinate. How can I do that with my Bambi models? I always get a dimension that looks like ..._dim_....
Thanks!

P.S. If you want to earn a few points, you can answer it on stackoverflow.

cluhmann · December 6, 2022, 2:08am

Welcome!

There are definitely resources for how to add them to pymc models (e.g., here and here), but I am unsure about bambi. @tcapretto ?

tcapretto · December 8, 2022, 12:51am

Dimensions are added automatically by Bambi depending on the levels of the variables in the model. Their names are “{variable_name}_dim” as you correctly point out.

I’m not sure I understand your problem though. Do you have an InferenceData object and you want to modify its dimensions in some way? If that’s the case, it’s more ArviZ/xarray related than Bambi related, but I’m still happy to help if you have an example.

mihagazvoda · December 8, 2022, 7:49pm

You’re correct. However, it’s much less handy to make plots (and joins) if you don’t have informative dimensions. Take this simple example:

import bambi as bmb
import pandas as pd
import arviz as az

df_simple = pd.DataFrame({
    'x': ['A', 'B', 'C'],
    'y': [10, 20, 30],
    'n': [100, 100, 100]
})

m = bmb.Model('p(y, n) ~ 0 + x', data=df_simple, family='binomial')
idata = m.fit(cores=4)

m.predict(idata)

az.plot_forest(idata, var_names='p(y, n)_mean', combined=True)

The plot has numbers instead of group names on the y-axis. In the documentation, the forest plot has correct y-axis labels right out of the box. What am I doing wrong / how can I achieve that (without the need to manually rename the axis labels, of course)?

tcapretto · December 13, 2022, 1:23am

Got it!

If you have a look at idata.posterior you will see something like the following

Notice that the coordinate for p(y, n)_mean is p(y, n)_obs, which is just the row number. This is because Bambi doesn’t know that each row is paired with one level of x_dim (A, B, and C). You can modify the plot after creation though. ArviZ functions usually return an array of matplotlib Axes.

axes = az.plot_forest(idata, var_names='p(y, n)_mean', combined=True)
axes[0].set(yticklabels=["C", "B", "A"], ylabel="Parameter name", xlabel="Posterior distribution");
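Judging by the reversed list above, the tick labels are given bottom-to-top while the rows are drawn top-to-bottom. A minimal sketch of deriving the labels from the data instead of hardcoding them, assuming the rows of df_simple are in the same order as the p(y, n)_obs coordinate (the name ytick_labels is just illustrative):

```python
import pandas as pd

df_simple = pd.DataFrame({
    "x": ["A", "B", "C"],
    "y": [10, 20, 30],
    "n": [100, 100, 100],
})

# Assumption: row order of df_simple matches the order of "p(y, n)_obs".
# Tick labels run bottom-to-top, so the row order is reversed.
ytick_labels = df_simple["x"].tolist()[::-1]
print(ytick_labels)  # ['C', 'B', 'A']
```

The result can then be passed as yticklabels=ytick_labels in the axes[0].set(...) call.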

EDIT: Another option is to modify the coords in the xarray.Dataset:

idata.posterior = idata.posterior.assign_coords({"p(y, n)_obs": ["A", "B", "C"]})
az.plot_forest(idata, var_names='p(y, n)_mean', combined=True);
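To see what assign_coords does without fitting a model, here is a small sketch on a stand-in Dataset. The random values and the names posterior and relabelled are assumptions for illustration only, not real Bambi output; the labels are taken from the x column, assuming its row order matches p(y, n)_obs.

```python
import numpy as np
import pandas as pd
import xarray as xr

df_simple = pd.DataFrame({
    "x": ["A", "B", "C"],
    "y": [10, 20, 30],
    "n": [100, 100, 100],
})

rng = np.random.default_rng(0)

# Stand-in for idata.posterior: 2 chains, 50 draws, one value per data row.
posterior = xr.Dataset(
    {"p(y, n)_mean": (("chain", "draw", "p(y, n)_obs"), rng.uniform(size=(2, 50, 3)))},
    coords={"chain": [0, 1], "draw": np.arange(50), "p(y, n)_obs": np.arange(3)},
)

# Replace the row numbers with the group labels from the data frame.
relabelled = posterior.assign_coords({"p(y, n)_obs": df_simple["x"].to_numpy()})

print(relabelled["p(y, n)_obs"].values.tolist())

# Label-based selection now works as well, which helps with joins.
p_b = relabelled["p(y, n)_mean"].sel({"p(y, n)_obs": "B"})
print(p_b.dims)
```

With a real fit, the same call would be applied to idata.posterior as in the line above, passing df_simple["x"].to_numpy() in place of the hardcoded list.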
