Working with Multi-index Coords

AlexAndorra · November 14, 2024, 11:04pm

Yeah, I share your pain @dreycenfoiles, I also work with MultiIndex all the time.

I have a workflow similar to yours that I tend to use, but very recently I ended up throwing that away and doing something much simpler. Just use something like this in your coords:

"obs_id": [
    # unpack in the same order as the selected columns: date, loc, state
    f"{date}_{loc}_{state}" for date, loc, state in Y[["date", "loc", "state"]].to_numpy()
]

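Then you can index your InferenceData object with isel, using your Y dataframe to pick the specific combinations of date and location (for instance) that you want. Since isel is positional, this assumes Y still has its default 0..n-1 index and the same row order as obs_id:
idata.isel(obs_id=Y[select_with_pandas].index)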
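To make that concrete, here is a minimal sketch of the whole round trip. The data, the column values and the mask are made up for illustration, and a plain xarray Dataset stands in for one group of the InferenceData object, so treat it as a sketch under those assumptions rather than the exact code I run:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical observations. Y keeps its default RangeIndex (0..n-1),
# so its index values double as positions along obs_id.
Y = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"],
        "loc": ["A", "B", "A", "B"],
        "state": ["NY", "CA", "NY", "CA"],
        "y": [1.0, 2.0, 3.0, 4.0],
    }
)

# One flat string label per row instead of a MultiIndex
obs_id = [
    f"{date}_{loc}_{state}"
    for date, loc, state in Y[["date", "loc", "state"]].to_numpy()
]

# Stand-in for one group of an InferenceData object (assumption)
ds = xr.Dataset(
    {"y": ("obs_id", Y["y"].to_numpy())},
    coords={"obs_id": obs_id},
)

# Do the selection in pandas, then pass the row positions to isel
mask = (Y["date"] == "2024-01-02") & (Y["loc"] == "A")
positions = Y[mask].index.to_numpy()
subset = ds.isel(obs_id=positions)

selected_labels = subset["obs_id"].values.tolist()
selected_values = subset["y"].values.tolist()
```

If Y has been filtered, sorted or re-indexed since the coords were built, the index no longer matches positions, so a `Y.reset_index(drop=True)` before building obs_id is probably a good idea.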

So yeah, I’m basically outsourcing the selection to pandas.
But the fact that xarray can’t handle selection on a MultiIndex (at least that I know of) is a huge limitation for me in these cases.

Another advantage of this solution is that it avoids unstack or reset_index("obs_id"), which are both computationally intensive: if you have big data, you can run out of RAM.
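One likely reason unstack gets expensive is that it builds the full grid of every level combination and fills the missing cells with NaN. A small sketch of that effect, with made-up data and a plain xarray Dataset (both assumptions, not taken from a real model):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical sparse data: 3 observations, each at its own date/location pair
Y = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
        "loc": ["A", "B", "C"],
    }
)
ds = xr.Dataset(
    {"y": ("obs_id", np.array([1.0, 2.0, 3.0]))},
    coords={
        "date": ("obs_id", Y["date"].to_numpy()),
        "loc": ("obs_id", Y["loc"].to_numpy()),
    },
)

# Turn obs_id into a (date, loc) MultiIndex, then unstack it
stacked = ds.set_index(obs_id=["date", "loc"])
dense = stacked.unstack("obs_id")

n_flat = ds["y"].size                       # one cell per observation
n_dense = dense["y"].size                   # one cell per date x loc combination
n_missing = int(dense["y"].isnull().sum())  # cells filled with NaN
```

Here 3 observations turn into a 3 x 3 grid with 6 empty cells. With thousands of dates and locations and only a few combinations observed, the dense array grows with the product of the level sizes, while the flat obs_id version stays at one cell per row.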

Hope this helps
