Doubt regarding az.plot_ppc(trace)

Nij_PADARIYA · February 3, 2025, 5:46am

In image 1, my observed values are always greater than zero, but I still used a normal distribution as the likelihood.

In image 2, I changed the likelihood to an exponential distribution, and I also tried a gamma.

My question: in image 1 I get extra predictions that are useless, but in image 2 I am not able to cover the observed distribution line.

For the image 2 model I tried many different parameters, prior values, and prior distributions, but I still cannot cover the whole graph.

So which is correct and which is not?
What should I use?

Keith_Min · February 4, 2025, 7:22pm

Hi!

Can you include the code that specifies the model, as well as the code you used to plot the posterior?

Thanks,
Keith

cluhmann · February 4, 2025, 8:37pm

What do you mean by this?

There are no “correct” or “incorrect” models. There are models you are happy with and models you are unhappy with. My suggestion would be to do some prior predictive checking (not posterior predictive checking). That will help you figure out if your model is behaving in ways that make sense to you.
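For readers unfamiliar with the idea, here is a minimal sketch of what a prior predictive check does, written in plain NumPy rather than PyMC. The HalfNormal(1) prior on the rate and the sizes `n_draws` and `n_obs` are assumptions made for illustration, not the model from this thread. In PyMC the same draws would come from `pm.sample_prior_predictive`, and `az.plot_ppc(idata, group="prior")` would plot them.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws, n_obs = 500, 100  # hypothetical sizes

# Step 1: draw the exponential rate from its prior.
# HalfNormal(1) is an assumed prior, used only for illustration.
lam = np.abs(rng.normal(0.0, 1.0, size=n_draws))

# Step 2: for each prior draw, simulate one fake dataset from the likelihood.
# NumPy's exponential takes the scale, which is 1 / rate.
y_sim = rng.exponential(scale=1.0 / lam[:, None], size=(n_draws, n_obs))

# Summaries to compare against what you believe real data should look like.
print("share of simulated values below 0.1:", np.mean(y_sim < 0.1))
print("median simulated value:", np.median(y_sim))
```

If these simulated datasets look nothing like plausible data, the priors or the likelihood probably need rethinking before any sampling is done.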

Nij_PADARIYA · February 5, 2025, 8:52am

This is my previous question, where I described my model and data.

Nij_PADARIYA · February 5, 2025, 8:53am

Can you check this now, based on my data and model?

cluhmann · February 5, 2025, 12:25pm

I can, but I would need to see what the prior predictive looks like and even then, you would be the best person to evaluate whether the model is behaving in the way you wish.

Keith_Min · February 8, 2025, 12:38am

I took a look at your model. It’s hard to tell what your graph is actually showing though. Could you paste your code for the second visualization? If it’s just showing a distribution plot of all Y_obs, I’m not sure how meaningful it is for figuring out what’s going on. I will say, though, your data seem to contain a lot of zeros and an exponential distribution probably won’t work for modeling counts with excess zeros (or any count data, for that matter).

To answer your question… like @aseyboldt said in your original post, I would recommend using a Negative-Binomial distribution. It’s like the Poisson distribution for count data, except that it also lets you handle overdispersion.
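To make "overdispersion" concrete, here is a small NumPy sketch comparing the two distributions at the same mean. The values `mu = 5` and `alpha = 2` are made up for illustration. With the mean/dispersion parameterization used by PyMC's `NegativeBinomial`, the variance is `mu + mu**2 / alpha`, whereas a Poisson always has variance equal to its mean.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, alpha = 5.0, 2.0  # hypothetical mean and dispersion

poisson = rng.poisson(mu, size=100_000)

# NumPy parameterizes the negative binomial by (n, p). Setting n = alpha and
# p = alpha / (alpha + mu) gives mean mu and variance mu + mu**2 / alpha.
negbin = rng.negative_binomial(alpha, alpha / (alpha + mu), size=100_000)

print("Poisson  mean, var:", poisson.mean(), poisson.var())
print("NegBinom mean, var:", negbin.mean(), negbin.var())
```

Both samples have a mean near 5, but the negative binomial's variance is roughly 17.5 instead of 5, which is the extra spread a Poisson cannot represent.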

Nij_PADARIYA · February 9, 2025, 2:36pm

It does not contain many zeros, but y_obs mostly contains values near zero.

Keith_Min · February 10, 2025, 4:46am

I mean the actual observed values. Didn’t you say they were counts? I’m assuming y has lots of zeros?

Nij_PADARIYA · February 10, 2025, 10:07am

Yes, they are counts of disease cases, but when I normalize them the values become close to zero.

Keith_Min · February 10, 2025, 2:20pm

If you’re normalizing the data, then you definitely shouldn’t use an exponential distribution. The exponential distribution cannot model negative numbers. Consider leaving the counts as they are and doing something like this: GLM: Negative Binomial Regression — PyMC example gallery

Nij_PADARIYA · February 10, 2025, 2:25pm

What if we use a gamma instead of a negative binomial?
Different regions have different populations, so we have to compute cases per 100k, and that is our normalization.
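As a rough sketch of how the per-100k normalization relates to the raw counts, the snippet below converts in both directions. The `cases` and `population` arrays are invented numbers, not data from this thread. The point is that a rate together with a population implies an expected count, so population differences can in principle be carried in the mean of a count likelihood instead of being divided out of the data. That is one option, not something proposed earlier in the thread.

```python
import numpy as np

# Hypothetical regions: raw case counts and populations.
cases = np.array([3, 40, 0, 12])
population = np.array([50_000, 1_200_000, 8_000, 300_000])

# The normalization described above: cases per 100k people.
rate_per_100k = cases / population * 100_000

# Going the other way: a rate implies an expected count for each region,
# which is what a count likelihood (Poisson, negative binomial) uses as its mean.
expected_cases = rate_per_100k * population / 100_000

print("rate per 100k:", rate_per_100k)
print("expected cases:", expected_cases)
```

Note that a region with zero cases gets a rate of exactly zero, which sits on the boundary of the gamma distribution's support, so exact zeros in the data are worth checking for before choosing a gamma likelihood.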

JAB · February 10, 2025, 3:39pm

I think this point has been made several times along the thread, but you really need to look at the prior predictive to know if your model is behaving as expected. If it does not generate the data you expect to see, then there is no way for your posterior to be any different. You seem to be unhappy with the high probability of zero in your PPC. However, this is completely expected with an exponential. You can certainly normalize to per 100k and then use a different distribution, including gamma. However, again, depending on your priors for alpha/beta, you may still have a model with high probability near zero. It seems that in the model link you posted above, you are defining alpha=1. This is likely inconsistent with what you expect from your data. Are you expecting something that looks more like a normal distribution centered around some positive value? Take a look at the visualization and see what you think:
https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.Gamma.html

Do you want a fixed value of alpha or would you consider placing priors on both alpha/beta? You can also parameterize the gamma using mu/sigma if that makes more sense for your application.
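A minimal NumPy sketch of the two points above: a gamma with alpha=1 is an exponential and piles probability up near zero, while a gamma specified through a mean and standard deviation can have a hump away from zero. The target values `mu=2.0` and `sigma=0.5` are hypothetical, and `gamma_alpha_beta` is a helper name introduced here, using the standard conversion alpha = mu²/sigma², beta = mu/sigma².

```python
import numpy as np

def gamma_alpha_beta(mu, sigma):
    """Convert a gamma mean/sd to shape (alpha) and rate (beta)."""
    alpha = mu**2 / sigma**2
    beta = mu / sigma**2
    return alpha, beta

rng = np.random.default_rng(2)

# alpha = 1 is exactly an exponential: the density is highest at zero.
# Rate 0.5 gives a mean of 2; NumPy takes scale = 1 / rate.
expo_like = rng.gamma(shape=1.0, scale=1.0 / 0.5, size=100_000)

# Hypothetical target: mean 2, sd 0.5, which gives alpha = 16, beta = 8.
alpha, beta = gamma_alpha_beta(mu=2.0, sigma=0.5)
humped = rng.gamma(shape=alpha, scale=1.0 / beta, size=100_000)

print("alpha, beta:", alpha, beta)
print("share below 0.5 (alpha=1): ", np.mean(expo_like < 0.5))
print("share below 0.5 (alpha=16):", np.mean(humped < 0.5))
```

Both samples have the same mean, but only the alpha=1 case puts a sizeable share of its mass close to zero, which is the behaviour described for the PPC above.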
