I’ve been skimming through the ArviZ documentation and came across the eight schools inference data.
import arviz as az
idata = az.load_arviz_data("centered_eight")
The inference data object also includes a school coordinate. How can I do that with my Bambi models? I always end up with a dimension named like ..._dim_... instead.
Thanks!
P.S. If you want to earn a few points, you can also answer it on Stack Overflow.
Welcome!
There are definitely resources for how to add them to pymc models (e.g., here and here), but I am unsure about bambi. @tcapretto ?
Dimensions are added automatically by Bambi depending on the levels of the variables in the model. Their names are “{variable_name}_dim” as you correctly point out.
I’m not sure I understand your problem though. Do you have an InferenceData object and you want to modify its dimensions in some way? If that’s the case, it’s more ArviZ/xarray related than Bambi related, but I’m still happy to help if you have an example.
You’re correct. However, it’s much less convenient to make plots (and joins) if you don’t have informative dimensions. Take this simple example:
import bambi as bmb
import pandas as pd
import arviz as az
df_simple = pd.DataFrame({
'x': ['A', 'B', 'C'],
'y': [10, 20, 30],
'n': [100, 100, 100]
})
m = bmb.Model('p(y, n) ~ 0 + x', data=df_simple, family='binomial')
idata = m.fit(cores=4)
m.predict(idata)
az.plot_forest(idata, var_names='p(y, n)_mean', combined=True)
The plot has numbers instead of group names on the y-axis. In the documentation, the forest plot has correct y-axis labels right out of the box. What am I doing wrong, and how can I achieve that (without manually renaming the axis labels, of course)?
Got it!
If you have a look at idata.posterior, you will see that the coordinate for p(y, n)_mean is p(y, n)_obs, which is just the row number. This is because Bambi doesn’t know that each row is paired with one level of x_dim (A, B, and C). You can modify the plot after creation though. ArviZ plotting functions usually return an array of matplotlib Axes.
axes = az.plot_forest(idata, var_names='p(y, n)_mean', combined=True)
axes[0].set(yticklabels=["C", "B", "A"], ylabel="Parameter name", xlabel="Posterior distribution");
EDIT: Another option is to modify the coords in the xarray.Dataset:
idata.posterior = idata.posterior.assign_coords({"p(y, n)_obs": ["A", "B", "C"]})
az.plot_forest(idata, var_names='p(y, n)_mean', combined=True);
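To avoid typing the labels by hand, the coordinate values can probably be taken straight from the data frame column. The sketch below is only an illustration: it builds a small synthetic xarray.Dataset as a stand-in for idata.posterior (so it runs without sampling), and the new dimension name "group" is an arbitrary choice, not something Bambi or ArviZ produces. It also assumes each row of the data corresponds to exactly one group, as in the toy example above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same toy data as in the example above.
df_simple = pd.DataFrame({
    "x": ["A", "B", "C"],
    "y": [10, 20, 30],
    "n": [100, 100, 100],
})

# Synthetic stand-in for idata.posterior: 2 chains, 50 draws, one value per row.
# The numbers are random and only there so the snippet runs without a model fit.
rng = np.random.default_rng(0)
obs_dim = "p(y, n)_obs"
n_obs = len(df_simple)
posterior = xr.Dataset(
    {"p(y, n)_mean": (("chain", "draw", obs_dim), rng.uniform(size=(2, 50, n_obs)))},
    coords={"chain": np.arange(2), "draw": np.arange(50), obs_dim: np.arange(n_obs)},
)

# Replace the row numbers with the group labels, taken in row order.
labelled = posterior.assign_coords({obs_dim: df_simple["x"].to_numpy()})

# Optionally give the dimension a more informative name ("group" is arbitrary).
labelled = labelled.rename({obs_dim: "group"})

# Label-based selection now works on the renamed dimension.
print(labelled["p(y, n)_mean"].sel(group="B").shape)
```

With a real fit, the same assign_coords call would be applied to idata.posterior and the result assigned back, as in the snippet above, before calling az.plot_forest.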