Plot posterior predictive several coordinates

Dear PYMC3 community,
I am trying to study do a posterior_predictive test of a simple inference problem but the model has 37 dimensions (‘resid’) and I am not sure how to get the 37 posterior_predictive plots. This is my model:

beta_prior = 4
sd_prior = 1
coords = {
    'resid': df_3fb5.columns,
    'step': df_3fb5.index
}
with pm.Model(coords=coords) as my_model:
    mu_fo = pm.Normal('mu_fo', mu=df_5vk6.mean(), sd=sd_prior, dims='resid')
    sd_fo = pm.HalfCauchy('sd_fo', beta=beta_prior, dims='resid')
    
    like_fo = pm.Normal('like_fo',
                        mu=mu_fo,
                        sd=sd_fo,
                        #dims='resid',
                        observed=df_5vk6)
with my_model:
    trace = pm.sample(2000, tune=1000, random_seed=RANDOM_SEED)
    ppc = pm.sample_posterior_predictive(
    trace, var_names=["like_fo"], random_seed=RANDOM_SEED,
    )
az.plot_ppc(my_model, var_names=["like_fo"], group='posterior')

The only thing I see with the plot is the ppc of the full data set, not the breakdown per coordinate (resid). If I try to add dim=‘resid’ to the likely_hood I get a shape error saying it expects a shape of 1000 which is the number of data points per coordintes.

Can anyone help me? Thank you very much in advance!
Best,
Sergio

1 Like

Hi Sergio,
I think you’ll have to call az.plot_ppc multiple times and organize your plots on different matplotlib axes in the same figure.

Since you’re already using dims and coords to parametrize your model, I think you should make use of the return_inferencedata kwarg in pm.sample and set it to True. That way, you’ll get an ArvIZ InferenceData object directly and will be able to select your dims and coords more intuitively for posterior handling – see the updated radon NB for several examples.
Most ArviZ functions have kwargs to directly use these dims and coords :wink:

Hope this helps :vulcan_salute:

This can be done using the flatten argument of plot_ppc.

You should pass an empty list as flatten to keep each resid as a different subplot. There is one example in the API docs https://arviz-devs.github.io/arviz/generated/arviz.plot_ppc.html#arviz.plot_ppc on how is flatten used. Unfortunately however, the corresponding image is not shown in the docs. I hope this means we have introduced a bug in the development version and that said bug is not in the latest release so the code works correctly for you and generates a plot even if the docs show no image.

I have just realized where my problem was with the likelyhood, I had to give it the two dimensions:

    like_fo = pm.Normal('like_fo',
                        mu=mu_fo,
                        sd=sd_fo,
                        dims=('step','resid'),
                        observed=df_5vk6)

@AlexAndorra
I didn’t know about the return_inferencedata option. This is very handy. Can I append to the xarray that is generated the ppc data so I can save it all together to disk?

@OriolAbril
Not having the image made me miss it, good thing you guys are aware of this. I am still running into a small problem. If I do this:

az.plot_ppc(my_model, flatten = [])
```python
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-39-ddd74f183c7b> in <module>
----> 1 az.plot_ppc(my_model, flatten = [])

~/data_partition/bin2/anaconda3/envs/nmr_assign_state/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    336     # TODO: Add backend kwargs
    337     plot = get_plotting_function("plot_ppc", "ppcplot", backend)
--> 338     axes = plot(**ppcplot_kwargs)
    339     return axes

~/data_partition/bin2/anaconda3/envs/nmr_assign_state/lib/python3.8/site-packages/arviz/plots/backends/matplotlib/ppcplot.py in plot_ppc(ax, length_plotters, rows, cols, figsize, animated, obs_plotters, pp_plotters, predictive_dataset, pp_sample_ix, kind, alpha, linewidth, mean, xt_labelsize, ax_labelsize, jitter, total_pp_samples, legend, group, markersize, animation_kwargs, num_pp_samples, backend_kwargs, show)
    160                     new_d = np.zeros((rep, len_density))
    161                     bins = np.digitize(pp_xs, new_x, right=True)
--> 162                     new_x -= (new_x[1] - new_x[0]) / 2
    163                     for irep in range(rep):
    164                         new_d[irep][bins[irep]] = pp_densities[irep]

IndexError: index 1 is out of bounds for axis 0 with size 1

I get something similar if I add coords={'resid': [64]}.

1 Like

This works! You have to flatten the other dimension.

az.plot_ppc(my_model, flatten=['step'])

Thanks for pointing me in the right direction. In any case, I am not sure if this could be made more intuitive or if I am just too much of a noob.

2 Likes

Short answer: yes starting from next release

Long answer: in a couple days in ArviZ development version. We recently added capabilities to ease this in ArviZ, mainly with InferenceData.extend for which I have been working on a tutorial blogpost to be published this week. While writing the post I realized there was a bug in extend which will be fixed by this already open PR.

Glad to be of help! I actually was not sure it would work given that there must be an error of some kind for docs to not be properly generated.

1 Like

Great! I look forward to it.
In any case, should I post arviz questions here or on the github? This is technically a pymc3 forum.

We still have no ArviZ specific forum (yet?) so feel free to ask here, on Stan forums or in ArviZ gitter/github issues.

We have also just released ArviZ 0.10.0 with a fix for the bug I mentioned above in extend