I read in the docs that I could use arviz.InferenceData.sel to create a subsample.
import arviz as az
idata = az.load_arviz_data("centered_eight")
sample = idata.sel(group='Choate')
I would like to get the index value where group='Choate'.
The idea is to create a new variable, as below:
post = idata.posterior
post["new_variable"] = 0
Next, I want to run a function with a boolean outcome for each group and assign its result to the corresponding index values. So instead of creating a subsample, I want the index/coordinates of the subsample so that I can update the corresponding values within the main sample.
Maybe it would help to explain what you are trying to accomplish. This sort of thing:
post["new_variable"] = 0
isn’t going to work as intended because variables in the posterior have multiple dimensions (e.g., group, chain, draw, etc.). So saying that the new variable should take on a value of 0 doesn’t provide enough values (or doesn’t provide a value of the appropriate shape) to be inserted into the posterior.
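One way to give the new variable the right shape is to build it from an existing variable. The following is only a minimal sketch: it uses a small hand-made xarray Dataset as a stand-in for idata.posterior, and it assumes the dimension that holds the school names is called "school" (substitute whatever name your data uses). The variable name theta and the values are illustrative.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for idata.posterior: 2 chains, 5 draws, 3 schools.
# The dimension name "school" is an assumption; use your own dimension name.
schools = ["Choate", "Deerfield", "Hotchkiss"]
theta_values = np.ones((2, 5, 3)) * np.array([1.0, -1.0, 2.0])
post = xr.Dataset(
    {"theta": (("chain", "draw", "school"), theta_values)},
    coords={"chain": [0, 1], "draw": np.arange(5), "school": schools},
)

# Create the new variable with the same dimensions and coordinates as theta,
# filled with integer zeros, instead of assigning a bare scalar.
post["new_variable"] = xr.zeros_like(post["theta"], dtype=int)

print(post["new_variable"].dims)
```

Because the new variable shares the dimensions of theta, it can later be indexed by chain, draw, and school just like any other posterior variable.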
The goal is to retrieve the index coordinates within post that belong to a certain group.
I would like to loop over the groups, get the index coordinates, run a function that returns 1 or 0 and update post["new_variable"].
So I would like to stay within the sample instead of creating subsamples, as arviz.InferenceData.sel() does.
I guess I am a bit confused because “group” here seems to refer to 2 different things. In InferenceData objects, “group” refers to the various xarray objects that are stored (e.g., idata.posterior, idata.prior, idata.prior_predictive, etc.). But I think you are referring to the dimension “group”, which takes on 8 values (the name of each school). Is that right? If so, then I am unclear what you mean by “index coordinates”.
If you simply want to manipulate values in the posterior group, you should be able to use the coordinates themselves (rather than integer indices). Something along these lines:
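A minimal sketch of that coordinate-based approach, under stated assumptions: the Dataset below is a hand-made stand-in for idata.posterior, the dimension holding the school names is assumed to be called "school", and flag is a hypothetical rule standing in for whatever boolean function you have in mind. It uses label-based assignment through .loc, and also shows how to look up the integer position of a label if you really need it.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for idata.posterior (dimension name "school" is assumed).
schools = ["Choate", "Deerfield", "Hotchkiss"]
theta_values = np.ones((2, 5, 3)) * np.array([1.0, -1.0, 2.0])
post = xr.Dataset(
    {"theta": (("chain", "draw", "school"), theta_values)},
    coords={"chain": [0, 1], "draw": np.arange(5), "school": schools},
)
post["new_variable"] = xr.zeros_like(post["theta"], dtype=int)


def flag(theta_for_school):
    """Hypothetical rule: 1 if the school's mean effect is positive, else 0."""
    return int(float(theta_for_school.mean()) > 0)


# Loop over the coordinate labels and assign in place by label via .loc.
for school in post["school"].values:
    subset = post["theta"].sel(school=school)
    post["new_variable"].loc[{"school": school}] = flag(subset)

# If an integer position is needed, it can be looked up from the index.
position = post.indexes["school"].get_loc("Choate")
print(position)
```

Assigning through .loc writes into the existing variable, so the full posterior is updated and no separate subsample has to be merged back in.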