How to get the corresponding index values of arviz.InferenceData.sel()

Hi,

I read in the docs that I could use arviz.InferenceData.sel to create a subsample.

import arviz as az

idata = az.load_arviz_data("centered_eight")
sample = idata.sel(group='Choate')

I would like to get the index value where group='Choate'.

The idea is to create a new variable like below

post = idata.posterior
post["new_variable"] = 0

Next, I want to run a function with a boolean outcome for each group and assign its result to the corresponding index values. So instead of creating a subsample, I want the index/coordinates of the subsample so that I can update the corresponding values within the main sample.

How can I achieve this?

Thank you

Maybe it would help to explain what you are trying to accomplish. This sort of thing:

post["new_variable"] = 0

isn’t going to work as intended because variables in the posterior have multiple dimensions (e.g., group, chain, draw, etc.). Saying that the new variable should take on a value of 0 doesn’t provide enough values (or doesn’t provide a value of the appropriate shape) to fill those dimensions in the posterior.
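To make that concrete, here is a minimal sketch on a small synthetic xarray Dataset standing in for idata.posterior. The names post, theta and scalar_variable, the sizes, and the three labels are assumptions for illustration only. As far as I know, xarray accepts a bare scalar but stores it as a zero-dimensional variable, whereas xr.zeros_like gives a variable with the same dimensions as an existing one.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior (assumption: 2 chains, 5 draws, 3 groups).
rng = np.random.default_rng(0)
post = xr.Dataset(
    {"theta": (("chain", "draw", "group"), rng.normal(size=(2, 5, 3)))},
    coords={
        "chain": [0, 1],
        "draw": np.arange(5),
        "group": ["Choate", "Deerfield", "Lawrenceville"],
    },
)

# A bare scalar is stored as a 0-dimensional variable: no chain/draw/group axes.
post["scalar_variable"] = 0
scalar_dims = post["scalar_variable"].dims

# zeros_like copies the dimensions and coordinates of an existing variable.
post["new_variable"] = xr.zeros_like(post["theta"], dtype=int)
full_dims = post["new_variable"].dims
```

With a full-shaped variable in place, there is one slot per chain, draw and group that can be updated later.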

Okay, then I could write something like this:

post["new_variable"] = post["existing_variable"] * 0

The goal is to retrieve the index coordinates within post that belong to a certain group.
I would like to loop over the groups, get the index coordinates, run a function that returns 1 or 0 and update post["new_variable"].

So I would like to stay within the full sample instead of creating subsamples as arviz.InferenceData.sel() does.

I guess I am a bit confused because “group” here seems to refer to 2 different things. In InferenceData objects, “group” refers to the various xarray objects that are stored (e.g., idata.posterior, idata.prior, idata.prior_predictive, etc.). But I think you are referring to the dimension “group”, which takes on 8 values (the name of each school). Is that right? If so, then I am unclear what you mean by an “index coordinate”.

If you simply want to manipulate values in the posterior group, you should be able to use the coordinates themselves (rather than integer indices). Something along these lines:

idata.posterior["new_variable"].loc[dict(group="Choate")] = my_array

or:

idata.posterior["new_variable"].loc[{"group": "Choate"}] = my_array

where my_array is an array of shape:

idata.posterior.sel(group="Choate")["new_variable"].shape
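Here is a rough sketch of the loop described earlier in the thread, again on a synthetic stand-in rather than the real posterior. The function is_positive is a hypothetical placeholder for your own check, and theta and the labels are made up. The dimension is called group to match the thread. In the bundled centered_eight data I believe the corresponding dimension is named school, so adjust the name accordingly. The key point is to select the variable first and then use .loc on it, because assigning into the result of indexing the whole Dataset writes to a temporary object rather than to the original.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior (assumption: 2 chains, 5 draws, 3 groups).
rng = np.random.default_rng(1)
post = xr.Dataset(
    {"theta": (("chain", "draw", "group"), rng.normal(size=(2, 5, 3)))},
    coords={
        "chain": [0, 1],
        "draw": np.arange(5),
        "group": ["Choate", "Deerfield", "Lawrenceville"],
    },
)
post["new_variable"] = xr.zeros_like(post["theta"], dtype=int)


def is_positive(values):
    """Hypothetical per-group check: 1 where a draw is above zero, else 0."""
    return (values > 0).astype(int)


for name in post["group"].values.tolist():
    selected = post["theta"].sel(group=name)  # dims: (chain, draw)
    # Pick the variable first, then .loc, so the write lands in `post` itself.
    # Passing plain NumPy values sidesteps any coordinate alignment.
    post["new_variable"].loc[dict(group=name)] = is_positive(selected).values
```

If the function is purely elementwise, as in this placeholder, the loop is not needed and (post["theta"] > 0).astype(int) gives the same result in one step. The loop only pays off when each group needs its own logic.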

Does that get you what you want?
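If what you are after really is the integer position of a label along the dimension, it can be looked up from the index behind the coordinate. This is a small sketch under the same assumptions as before (synthetic data, made-up names and labels), not something taken from the ArviZ docs.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior (assumption: 3 labelled groups).
post = xr.Dataset(
    {"theta": (("chain", "draw", "group"), np.zeros((2, 5, 3)))},
    coords={"group": ["Choate", "Deerfield", "Lawrenceville"]},
)

# The pandas Index behind the "group" coordinate maps a label to its position.
position = post.indexes["group"].get_loc("Deerfield")

# The same lookup with a boolean mask, which extends to several labels at once.
mask = (post["group"] == "Deerfield").values
positions = np.flatnonzero(mask)

# Positional and label-based selection pick out the same slice.
by_position = post["theta"].isel(group=position)
by_label = post["theta"].sel(group="Deerfield")
```

In most cases the label-based form is easier to read and does not depend on the order of the groups.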
