Plot_ppc() with discrete data - How to control bins?

red_falcon · February 5, 2021, 3:50pm

So I have some overdispersed count data and I want to show visually that a negative binomial model fits it better than a Poisson model. The issue is that plot_ppc() uses bin sizes that vary inconsistently between the plotted distributions, and I can’t find a keyword argument to fix this.

Here are some example images so you can see what I mean:
Poisson plot:

Negative Binomial plot:

You can see that most of the Poisson distributions use more bins than either the observed data or the negative binomial distributions. If you look closely, though, some of the light-blue (low-alpha) negative binomial fits use more bins, and some of the Poisson ones use fewer.

Does anyone know of a way to make the number of bins fixed and consistent? This isn’t an issue when I plot the cumulative distributions, which would be fine from a technical point of view, but I really like these density views: they show that the Poisson fit finds the mean of the distribution pretty well but struggles with the spread of the data, since a Poisson distribution forces the variance to equal the mean.

Here is some example code of how I am creating the above plots right now:

import arviz as az
import pymc3 as pm

with pois_model:
    # Draw posterior predictive samples from the fitted Poisson model
    ppc = pm.sample_posterior_predictive(pois_trace)
    ax = az.plot_ppc(az.from_pymc3(posterior_predictive=ppc, model=pois_model),
                     kind='kde', var_names=['home_points'], num_pp_samples=2000)
    ax[0].set_xlim(-5, 20)

OriolAbril · February 5, 2021, 10:46pm

It is not possible right now; one (or maybe a couple of) keyword arguments would need to be added to make it possible.

If you are interested in working on this I can guide you through the process to submit a PR to ArviZ.
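
In the meantime, one possible workaround is to skip plot_ppc() for this plot and draw the histograms by hand with a single set of integer-centred bin edges shared by the observed data and every posterior predictive draw. The following is only a minimal sketch, not ArviZ functionality: the helper names `shared_integer_bins` and `plot_discrete_ppc` are made up for illustration, and it assumes the posterior predictive samples are an integer array of shape (n_draws, n_observations).

```python
import numpy as np


def shared_integer_bins(observed, pp_samples):
    """Return bin edges so that every integer in the combined range gets its own bin."""
    lo = min(np.min(observed), np.min(pp_samples))
    hi = max(np.max(observed), np.max(pp_samples))
    # Edges sit at k - 0.5, so each bin contains exactly one integer value k.
    return np.arange(lo - 0.5, hi + 1.5, 1.0)


def plot_discrete_ppc(observed, pp_samples, num_pp_samples=100, ax=None, seed=0):
    """Overlay per-draw histograms and the observed histogram on the same bins.

    Hypothetical helper; pp_samples is assumed to have shape (n_draws, n_obs).
    """
    import matplotlib.pyplot as plt  # imported lazily, only needed for plotting

    if ax is None:
        _, ax = plt.subplots()
    pp_samples = np.asarray(pp_samples)
    bins = shared_integer_bins(observed, pp_samples)

    # Plot a random subset of draws to keep the figure readable.
    rng = np.random.default_rng(seed)
    n_draws = min(num_pp_samples, len(pp_samples))
    idx = rng.choice(len(pp_samples), size=n_draws, replace=False)
    for draw in pp_samples[idx]:
        ax.hist(draw, bins=bins, density=True, histtype="step",
                color="C0", alpha=0.1)

    ax.hist(observed, bins=bins, density=True, histtype="step",
            color="k", label="Observed")
    ax.legend()
    return ax
```

With the objects from the original post this would be called as something like plot_discrete_ppc(observed_points, ppc['home_points']), where observed_points is a placeholder name for the array of observed counts. Because both models are drawn on identical bins, the Poisson and negative binomial plots can be compared directly.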
