What is the best way to estimate theta and psi for ZeroInflatedPoisson?


I need to run a model on a target variable that follows a ZeroInflatedPoisson distribution. I see from the PyMC3 3.11.5 documentation for discrete distributions that the distribution takes the parameters theta and psi. I have read that I should estimate theta as (number of samples equal to 0) / (total number of samples).

What is the best way to set psi? The documentation describes psi as the “Expected proportion of Poisson variates (0 < psi < 1)”. I don’t know what that means.

Also, how would I use this with a regression model? Is the regression formula used for the parameter theta?

I think this is backwards. I found this discussion of the ZIP parameters helpful.

Theta is the usual rate parameter of a Poisson distribution. You can confirm this by directly comparing the second branch of the ZIP PDF shown on the page you linked with the PDF of a standard Poisson (which you recover when psi = 1).
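
For reference, the comparison can be made against the ZIP PDF written out in the parameterization the PyMC3 documentation uses, where psi weights the Poisson component. Setting $\psi = 1$ removes the extra mass at zero, and both branches collapse to the ordinary Poisson PDF:

$$
f(x \mid \psi, \theta) =
\begin{cases}
(1 - \psi) + \psi e^{-\theta} & \text{if } x = 0 \\
\psi \dfrac{e^{-\theta} \theta^{x}}{x!} & \text{if } x = 1, 2, 3, \ldots
\end{cases}
$$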

So you want to estimate theta as usual in a standard Poisson setup: do some stuff to get logits, then set theta = exp(logits).
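
To make that recipe concrete, here is a minimal plain-Python sketch (it is not PyMC3 code). The helper names `linear_predictor` and `zip_logpmf`, the coefficients, and the data are all hypothetical and chosen only for illustration; the log-probability follows the ZIP PDF from the linked documentation page, with psi as the weight on the Poisson component.

```python
import math

def linear_predictor(x, intercept, slope):
    """Unconstrained real-valued quantity (what the post calls 'logits')."""
    return intercept + slope * x

def zip_logpmf(y, psi, theta):
    """Log-probability of count y under a zero-inflated Poisson.

    psi is the weight on the Poisson component, theta is the Poisson rate.
    Assumes 0 < psi <= 1 and theta > 0.
    """
    log_poisson = -theta + y * math.log(theta) - math.lgamma(y + 1)
    if y == 0:
        # Zeros come from the extra-zero component or from the Poisson itself.
        return math.log((1.0 - psi) + psi * math.exp(log_poisson))
    return math.log(psi) + log_poisson

# Hypothetical coefficients and data, for illustration only.
intercept, slope, psi = 0.2, 0.5, 0.7
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 2, 0, 6]

# theta = exp(linear predictor) keeps the rate positive for any real input.
thetas = [math.exp(linear_predictor(x, intercept, slope)) for x in xs]
log_likelihood = sum(zip_logpmf(y, psi, t) for y, t in zip(ys, thetas))
print(thetas, log_likelihood)
```

In a PyMC3 model the intercept, slope, and psi would presumably get priors, and the resulting theta and psi would be passed to ZeroInflatedPoisson along with the observed counts.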

Psi is related to how many “extra” zeros you want to generate. As noted above, when psi = 1, you get a standard Poisson PDF, so all the zeros in your data are assumed to come from a Poisson. When psi = 0, you are saying that your process ONLY generates zeros (edited: I originally wrote NEVER here).

So you could set psi ~ Beta(1, 1) if you want to be uninformative, or you could try to look at your data and try to skew it one way or the other, but you’ll probably want to use some kind of beta prior.


With psi = 0 you are actually saying you can only generate zeros: all draws come from the zero-inflated component.


Yes, you are exactly right. I missed the \psi in front of the Poisson process in the x \neq 0 branch of the PDF. Thank you for correcting me.
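
The two limiting cases discussed here can be checked numerically. This is a small sketch using only the x = 0 branch of the ZIP PDF; the function name `zero_probability` and the rate value are hypothetical. Strictly speaking psi = 0 and psi = 1 sit on the boundary of the documented range (0 < psi < 1), so they are shown only as limits.

```python
import math

def zero_probability(psi, theta):
    """P(x = 0) under a ZIP: extra zeros plus zeros from the Poisson part."""
    return (1.0 - psi) + psi * math.exp(-theta)

theta = 3.0  # hypothetical Poisson rate
for psi in (1.0, 0.5, 0.0):
    print(f"psi={psi}: P(x=0) = {zero_probability(psi, theta):.4f}")
```

With psi = 1 the zero probability is just the Poisson value exp(-theta), and with psi = 0 it is exactly 1, so every draw is a zero. Note that the observed fraction of zeros in the data corresponds to this combined quantity, not to theta on its own.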

Thank you. When you say “logits,” I think of a quantile function used to get the probability of a binary outcome. Is that what you mean? How does that fit with trying to forecast count data?

I just mean some quantity you compute in \mathbb R^n before you put it through a link function. I don’t know what the right word for this is in Bayesian lingo; I took the word “logits” from the deep learning literature. Latent variable? Latent value?
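
For what it’s worth, in GLM terminology this unconstrained quantity is usually called the linear predictor. A small sketch of the distinction, with hypothetical function names: the same real-valued input can be mapped to a positive rate (for counts) or to a probability (for binary outcomes), depending on which inverse link you apply.

```python
import math

def inverse_log_link(eta):
    """Maps any real number to a positive rate (used for count data)."""
    return math.exp(eta)

def inverse_logit_link(eta):
    """Maps any real number to a probability in (0, 1) (used for binary data)."""
    return 1.0 / (1.0 + math.exp(-eta))

# The same unconstrained values give different kinds of parameters.
for eta in (-2.0, 0.0, 2.0):
    print(eta, inverse_log_link(eta), inverse_logit_link(eta))
```

Only the second mapping involves logits in the strict sense; for the Poisson rate theta, the exponential is what matters.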

Ha! I get so confused about using the right “lingo”. Thank you!