az.plot_ppc() and coords

Running on PyMC v5.3.0
Running on ArviZ v0.15.1
Hi,
Hopefully somebody can see how to solve what seems like a simple problem with coords and PPCs: using az.plot_ppc(idata, coords=...) to plot posterior predictive checks for separate groups.

The problem appears to be solved in this example from the arviz.plot_ppc page of the ArviZ 0.16.1 documentation, which calls az.plot_ppc() on individual groups:

idata_radon = az.load_arviz_data('radon')
az.plot_ppc(idata_radon, data_pairs={"y": "y"}, num_pp_samples=20)
obs_county = idata_radon.posterior["County"][idata_radon.constant_data["county_idx"]]
idata_radon = idata_radon.assign_coords(obs_id=obs_county, groups="observed_vars")
az.plot_ppc(idata_radon, coords={'obs_id': ['ANOKA', 'BELTRAMI']}, flatten=[], num_pp_samples=20)

These five lines of code work perfectly for the radon idata.

The example I am working on is from "Coordinates in PyMC & InferenceData Objects" by Christian Luhmann (cluhmann.github.io), which explains the use of coords in PyMC.

This is the adapted code:

import arviz as az
import numpy as np
import pandas as pd
import pymc as pm

# Create some known blood pressure (BP) data for two groups (males and females)

rng = np.random.default_rng(101010)

group_size = 100
systolic_f = rng.normal(loc=120, scale=2, size=group_size)
systolic_m = rng.normal(loc=150, scale=2, size=group_size)

systolic = np.hstack((systolic_f, systolic_m))
gender = (['female'] * group_size) + (['male'] * group_size)

df = pd.DataFrame({'bp': systolic, 'gender': gender})
df.head()

# Integer codes (0 = female, 1 = male) used to index mu for each observation
cat = pd.Categorical(df['gender'])

with pm.Model(coords = {'gender':['female','male']}) as labeled_model:
    mu = pm.Normal('mu', 140, sigma=15, dims='gender')
    
    likelihood = pm.Normal('likelihood', 
                           mu=mu[cat.codes],
                           sigma=5, 
                           observed=df['bp'])
    idata_coords=pm.sample(chains=2)
    idata_coords.extend(pm.sample_posterior_predictive(idata_coords))
az.plot_ppc(idata_coords,num_pp_samples=20) 

This line works, but getting the PPC plot for the 'male' and 'female' groups separately does not:

az.plot_ppc(idata_coords, coords={'gender': ['female', 'male']})

Could anyone tell me how to write this last call, using az.plot_ppc() and coords, so that I get posterior predictive plots for the individual groups, as in the radon example above? Thank you in advance for looking at the code.

The PPC plot does not use the mu variable but the likelihood one, which is the one that can be compared with the observations. And likelihood has no information about any dimensions or coordinates: gender is a dimension of mu only, so plot_ppc can't know which observations correspond to which gender.

This is also what happens in the radon example in the plot_ppc docstring. We need to assign the coordinate values we want before calling plot_ppc. You'll have to follow the same steps, but instead of obs_county (an array with the county that corresponds to each observation) you need an array with the gender of each observation. In your case I think that is df['gender'] (equivalently cat.categories[cat.codes]); cat.categories on its own only holds the two unique labels.
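
To make those steps concrete, here is a minimal sketch rather than a drop-in fix. It assumes the likelihood's observation dimension is called obs_id (a name chosen here for illustration; check idata_coords.observed_data for the actual dimension name in your model), and it builds a small stand-in InferenceData with az.from_dict and made-up posterior predictive draws, so it runs without sampling the model. The only parts that carry over to your case are the assign_coords call and the per-group plot_ppc calls.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(101010)
group_size = 100

# One label per observation (200 entries), not just the two unique values.
gender = np.array(["female"] * group_size + ["male"] * group_size)
observed = np.hstack([
    rng.normal(loc=120, scale=2, size=group_size),
    rng.normal(loc=150, scale=2, size=group_size),
])

# Made-up posterior predictive draws with shape (chain, draw, obs_id).
# They only stand in for the output of pm.sample_posterior_predictive.
group_means = np.where(gender == "female", 120.0, 150.0)
pp_draws = rng.normal(loc=group_means, scale=5.0, size=(2, 50, gender.size))

idata = az.from_dict(
    posterior_predictive={"likelihood": pp_draws},
    observed_data={"likelihood": observed},
    dims={"likelihood": ["obs_id"]},  # "obs_id" is an assumed dimension name
)

# Attach the gender of each observation to both observed_data and
# posterior_predictive, as the radon example does with the county names.
idata = idata.assign_coords(obs_id=gender, groups="observed_vars")

if __name__ == "__main__":
    # One call per group; each plot only uses that group's observations.
    for label in ["female", "male"]:
        az.plot_ppc(idata, coords={"obs_id": [label]}, num_pp_samples=20)
```

Calling plot_ppc once per label keeps the default flattening, so each figure pools all observations of that gender. Passing flatten=[] as in the radon example would instead keep the observations separate, which is probably not what you want with 100 observations per group.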