Zero inflated Normal

I would love help in exploring the implementation of a zero-inflated Normal. I think of my model as a mixture model where there is some probability that the outcome variable will be zero, and otherwise it takes a value from a Normal with some mean mu and standard deviation std. I would love to model this as a mixture model but have not been able to find a good implementation. Thanks!

An example of the distribution:
[screenshot of the empirical distribution]
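To make the model I have in mind concrete, here is a minimal sketch of the density and sampler in plain numpy/scipy (function names are mine, not from any library): with probability pi the outcome is exactly zero, otherwise it is Normal(mu, sigma).

```python
import numpy as np
from scipy import stats

def zi_normal_logpdf(y, pi, mu, sigma):
    """Log-density of a zero-inflated Normal (sketch, not a library API).

    With probability `pi` the outcome is exactly 0; otherwise it is
    Normal(mu, sigma).  Because the continuous part puts zero mass on
    any single point, an exact zero is attributed entirely to the spike.
    """
    y = np.asarray(y, dtype=float)
    return np.where(
        y == 0,
        np.log(pi),
        np.log1p(-pi) + stats.norm.logpdf(y, mu, sigma),
    )

def zi_normal_rvs(pi, mu, sigma, size, rng=None):
    """Draw samples: zero with probability pi, else Normal(mu, sigma)."""
    rng = np.random.default_rng(rng)
    is_zero = rng.random(size) < pi
    return np.where(is_zero, 0.0, rng.normal(mu, sigma, size))
```

This is just the generative story written out; the question is how to express that likelihood inside pymc3.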

It’s unusual to have a zero-inflated continuous distribution (e.g. Normal). However, pymc3 provides a number of zero-inflated discrete distributions (Poisson, Binomial, Negative Binomial), which might be suitable for your data, based on the figure you provided. If your data is continuous, you could transform it by scaling and binning it to the naturals.
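For instance, a rough sketch of that transform; the data and the scale factor here are made up, and the scale should be chosen to match your data’s resolution:

```python
import numpy as np

# Hypothetical continuous, nonnegative outcome with exact zeros
# (a stand-in for the data in the screenshot).
rng = np.random.default_rng(0)
y = np.where(rng.random(1000) < 0.3, 0.0, rng.gamma(2.0, 1.5, size=1000))

# Scale, then round to the naturals so a discrete zero-inflated
# likelihood (e.g. pm.ZeroInflatedPoisson) can be applied.
scale = 10.0
y_counts = np.rint(y * scale).astype(int)
```

Exact zeros stay at zero, so the zero-inflation structure survives the transform (though very small nonzero values can also round down to zero).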

Spike and slab can be used; it’s most often used as a prior, but it can also serve as a likelihood.

You could also check out the Tweedie distribution.


I can’t comment on how unusual they are in general, but one can certainly find plenty of zero-inflated continuous dists in insurance claims severity data, and probably anywhere there’s conditional outcomes on events.

McElreath has a nice example here for a zero-inflated gamma likelihood on the quantity of meat returned from hunting expeditions. I used this model principle with success for a zero-inflated lognormal likelihood.

There is a good discussion here: Zero-Inflated models in Stan - General - The Stan Forums

A mixture between a continuous and a discrete distribution is not really a mixture; it’s more a model with two outcomes/likelihoods (one binomial for the discrete zeros and one continuous for the rest). Since there is no crosstalk between the two components, one can model them separately or ignore one altogether (e.g. drop the zeros) without loss of information for the kept parameters.
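To make the “no crosstalk” point concrete, here is a sketch in plain numpy (the function name is mine) showing that the maximum-likelihood fits decouple: the zero-probability comes from the zero indicators alone, and the Normal parameters come from the nonzero values alone.

```python
import numpy as np

def fit_zi_normal(y):
    """MLE for a zero-inflated Normal likelihood (sketch).

    The log-likelihood factorizes into a binomial term for the zero
    indicators and a Normal term for the nonzero values, so the two
    pieces can be maximized completely independently.
    """
    y = np.asarray(y, dtype=float)
    pi_hat = np.mean(y == 0)     # zero-probability: indicators only
    nonzero = y[y != 0]
    mu_hat = nonzero.mean()      # Normal part: nonzero values only
    sigma_hat = nonzero.std()
    return pi_hat, mu_hat, sigma_hat
```

Dropping the zeros would change nothing about mu_hat or sigma_hat, which is exactly the “without loss of information” claim above.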

One exception is if there is considerable rounding going on, such that the (now discretized) continuous distribution can actually generate exact zeros.

Thanks for the link - will read. I thought it interesting that McElreath’s model treats the mixing parameter pi as continuous (he uses a Normal prior with an inverse-logit link), presumably to keep the parameter space smooth and let his MvNormal priors work…
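For what it’s worth, that parameterization amounts to something like this sketch (scipy’s `expit` is the inverse-logit; the coefficient values are made up):

```python
import numpy as np
from scipy.special import expit, logit

# Put the prior on an unconstrained linear predictor and map it through
# the inverse-logit, so pi stays in (0, 1) while a Normal (or MvNormal)
# prior can live on the coefficients (a, b) directly.
a, b = -1.0, 0.5
x = np.linspace(-3.0, 3.0, 7)
eta = a + b * x       # unconstrained, real-valued
pi = expit(eta)       # mixing probability, always in (0, 1)
```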


Also, someone on that thread called such a mixture “zero-augmented”, to indicate that the continuous distribution of choice might not have any support at zero. I quite like that terminology and will use it.