Cannot plot ppc for specific sub_groups

I have this model which seems to perform fairly well, i’d like to plot_ppc for specific sub_groups though, which i can’t figure out how to do using az.plot_ppc. How do I create a ppc plot for each sub_group?

Create mappings

subgroup_to_prodgroup = first_deliveries_df[[‘groupCode2’, ‘groupCode1’]].drop_duplicates()

Reindex using factorize indices

_, sub_groups = pd.factorize(subgroup_to_prodgroup[‘groupCode2’])
_, prod_groups = pd.factorize(subgroup_to_prodgroup[‘groupCode1’])

Create mapping: for each subgroup, which product group index

subgroup_to_prodgroup_idx = pd.Series(prod_groups).values

Factorize the product groups and subgroups first

prod_group_idx, prod_groups = pd.factorize(first_deliveries_df[‘groupCode1’])
sub_group_idx, sub_groups = pd.factorize(first_deliveries_df[‘groupCode2’])

For each sub_group, find corresponding prod_group index

subgroup_df = first_deliveries_df[[‘groupCode1’, ‘groupCode2’]].drop_duplicates()
subgroup_df[‘prod_group_idx’] = pd.factorize(subgroup_df[‘groupCode1’])[0]
subgroup_df[‘sub_group_idx’] = pd.factorize(subgroup_df[‘groupCode2’])[0]

Create an array: subgroup_idx → prod_group_idx

mapping_array = subgroup_df.sort_values(‘sub_group_idx’)[‘prod_group_idx’].values

coords = {‘sub_groups’: sub_groups, ‘prod_groups’ : prod_groups}

with pm.Model(coords=coords) as lead_times_model:
# Hyper priors
mu = pm.Normal(“mu”, mu=100, sigma=30)
sigma = pm.HalfCauchy(“sigma”, beta=30)

# Parameters for Groups
mu_group = pm.Normal('mu_group', mu=mu, sigma=30, dims="prod_groups")
sigma_group = pm.HalfCauchy('sigma_group', beta=30, dims="prod_groups")

# Parameters for Subgroups
mu_sub_group = pm.Normal('mu_sub_group', mu=mu_group[mapping_array], sigma=30, dims="sub_groups")
sigma_sub_group = pm.HalfCauchy('sigma_sub_group', beta=30, dims="sub_groups")

# Noise term
sigma_obs = pm.HalfCauchy("sigma_obs", beta=30)

mu_exp = pm.Deterministic('mu_exp', pm.math.exp(mu_sub_group), dims="sub_groups")

# Likelihood
lead_time_obs = pm.LogNormal(
    "lead_time_obs", 
    mu=mu_sub_group[sub_group_idx], 
    sigma=sigma_obs,
    observed=first_deliveries_df['actualLeadTimeDays'].values
)

# Sampling
idata_lt = pm.sample(return_inferencedata=True)
idata_lt.extend(pm.sample_posterior_predictive(idata_lt))

Look at the third-last and second-last charts here: arviz.plot_ppc — ArviZ dev documentation. You have to use the index from constant data, and flatten.

thanks @zweli

So this should work?

obs_sub_group = idata_lt.posterior[“sub_groups”][idata_lt.constant_data[“sub_group_idx”]]
idata_lt = idata_lt.assign_coords(obs_id=obs_sub_group, groups=“observed_vars”)
az.plot_ppc(idata_lt, coords={‘obs_id’: [‘1107’, ‘1110’]}, flatten=)

Cause i get the error
AttributeError: ‘InferenceData’ object has no attribute ‘constant_data’