Zero inflated Normal

archie212 · February 28, 2021, 6:42pm

I would love help exploring how to implement a zero-inflated Normal. I think of my model as a mixture model in which there is some probability that the outcome variable is zero; if it is not zero, it takes a value from a Normal distribution with mean mu and standard deviation std. I would love to model this as a mixture, but I have not been able to find a good implementation. Thanks!

An example of the distribution:
[Image: Screen Shot 2021-02-28 at 1.41.22 PM]

jstanley · March 1, 2021, 7:08pm

It’s unusual to have a zero-inflated continuous distribution (e.g. normal). However, PyMC3 provides several zero-inflated discrete distributions (Poisson, Binomial, Negative Binomial), which might suit your data, judging by the figure you provided. If your data are continuous, you could transform them by binning and scaling to the natural numbers.
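A minimal sketch of the binning idea, assuming non-negative data and a bin width chosen by hand. The helper name `to_counts` and the choice to keep exact zeros in their own bin are illustrative assumptions, not part of PyMC3:

```python
import math

def to_counts(values, bin_width):
    """Map non-negative continuous values to natural-number bin indices."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = []
    for v in values:
        if v < 0:
            raise ValueError("values must be non-negative")
        # Exact zeros stay at 0; positive values land in bins 1, 2, 3, ...
        counts.append(0 if v == 0 else math.ceil(v / bin_width))
    return counts

y = [0.0, 0.0, 0.4, 1.2, 2.5, 7.9]
print(to_counts(y, bin_width=0.5))
```

The resulting integers could then be passed to one of the zero-inflated discrete likelihoods. The bin width controls how much resolution is lost, so it is worth checking that conclusions do not depend strongly on it.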

Dirk_Nachbar · March 2, 2021, 4:52pm

Spike and slab can be used; it is most often used as a prior, but it can also be used for the likelihood.

You could also check the Tweedie distribution:

https://gist.github.com/MatsuuraKentaro/952b3301686c10adcb13

model.stan

data {
  int N;
  int M;
  real<lower=0> Y[N];
}

parameters {
  real<lower=0> mu;
  real<lower=0> phi;
  real<lower=1, upper=2> theta;

This file has been truncated.

run-model.R

library(rstan)
library(tweedie)

stanmodel <- stan_model(file='model/model.stan')

N1 <- 200
M1 <- 30
Y1 <- rtweedie(N1, power=1.3, mu=1, phi=1)
data1 <- list(N=N1, M=M1, Y=Y1)
fit1 <- sampling(stanmodel, data=data1)

This file has been truncated.
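For context on why the Tweedie is relevant here: when the power parameter lies strictly between 1 and 2 (as `theta` is constrained to in the Stan model above), the Tweedie is a compound Poisson-gamma distribution, so it has a point mass at zero plus a continuous positive part. The following is a rough stdlib-only Python sketch of that representation, reusing the parameter values from the `rtweedie` call. The function names are hypothetical, and this is not a port of the gist:

```python
import math
import random

def tweedie_zero_prob(mu, phi, power):
    """P(Y = 0) for a Tweedie with 1 < power < 2 (compound Poisson-gamma)."""
    if not 1 < power < 2:
        raise ValueError("this representation needs 1 < power < 2")
    lam = mu ** (2 - power) / (phi * (2 - power))  # Poisson rate
    return math.exp(-lam)

def sample_tweedie(mu, phi, power, rng):
    """Draw one value as a Poisson-distributed number of gamma summands."""
    lam = mu ** (2 - power) / (phi * (2 - power))
    shape = (2 - power) / (power - 1)
    scale = phi * (power - 1) * mu ** (power - 1)
    # Poisson draw: count unit-rate exponential arrivals before time lam
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    # With n == 0 the sum is empty, which gives an exact zero
    return sum(rng.gammavariate(shape, scale) for _ in range(n))

rng = random.Random(0)
draws = [sample_tweedie(1.0, 1.0, 1.3, rng) for _ in range(20000)]
print("theoretical P(Y=0):", tweedie_zero_prob(1.0, 1.0, 1.3))
print("simulated   P(Y=0):", sum(d == 0 for d in draws) / len(draws))
```

Note that the Tweedie only supports non-negative values, so it fits zero-inflated positive data rather than a Normal that can go negative.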

jonsedar · March 8, 2021, 5:11am

I can’t comment on how unusual they are in general, but one can certainly find plenty of zero-inflated continuous distributions in insurance claims severity data, and probably anywhere outcomes are conditional on events.

McElreath has a nice example here http://xcelab.net/rmpubs/Mcelreath%20Koster%202014.pdf of a zero-inflated gamma likelihood for the quantity of meat returned from hunting expeditions. I have used the same modelling principle successfully with a zero-inflated lognormal likelihood.
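To make the principle concrete, here is a minimal sketch of the per-observation log-likelihood for a zero-inflated lognormal, written in plain Python. The parameter name `p_zero` (the probability of a zero) and the function name are assumptions for illustration; this is not code from the paper:

```python
import math

def zi_lognormal_logpdf(y, p_zero, mu, sigma):
    """Log-density of one observation under a zero-inflated lognormal."""
    if y < 0:
        return -math.inf
    if y == 0:
        return math.log(p_zero)  # point mass at zero
    # Positive values: probability of being non-zero times lognormal density
    z = (math.log(y) - mu) / sigma
    log_density = -math.log(y * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
    return math.log1p(-p_zero) + log_density

data = [0.0, 0.0, 1.5, 0.7, 3.2]
total = sum(zi_lognormal_logpdf(y, p_zero=0.4, mu=0.0, sigma=1.0) for y in data)
print(total)
```

In a probabilistic programming framework the same expression would typically be supplied as a custom log-probability term, with `p_zero`, `mu` and `sigma` given priors.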

ricardoV94 · March 8, 2021, 6:38am

There is a good discussion here: Zero-Inflated models in Stan - General - The Stan Forums

A mixture of a continuous and a discrete component is not really a mixture; it is more a model with two outcomes/likelihoods (one binomial for the discrete zeros and one continuous for the rest). Since there is no crosstalk between the two components, one can model them separately, or ignore one altogether (e.g. drop the zeros), without loss of information about the parameters that are kept.
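A small simulation illustrates the point, assuming the zero-inflated Normal from the original question. The zero probability is estimated from the zero/non-zero split alone, and the Normal parameters from the non-zero values alone. The function names and parameter values are made up for the example:

```python
import random
import statistics

def simulate(n, p_zero, mu, sigma, rng):
    """Zero with probability p_zero, otherwise a Normal(mu, sigma) draw."""
    return [0.0 if rng.random() < p_zero else rng.gauss(mu, sigma)
            for _ in range(n)]

def fit_two_part(y):
    """Estimate the zero probability and the Normal parameters separately."""
    nonzero = [v for v in y if v != 0.0]
    p_zero_hat = 1 - len(nonzero) / len(y)  # uses only the zero/non-zero split
    mu_hat = statistics.fmean(nonzero)      # uses only the non-zero values
    sigma_hat = statistics.stdev(nonzero)
    return p_zero_hat, mu_hat, sigma_hat

rng = random.Random(1)
y = simulate(5000, p_zero=0.3, mu=5.0, sigma=2.0, rng=rng)
print(fit_two_part(y))
```

Dropping the zeros before estimating `mu` and `sigma` loses nothing here because a continuous Normal draw is exactly zero with probability zero, so every observed zero can be attributed to the discrete component.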

One exception is when there is so much rounding that the (now) discretized continuous distribution can actually generate zeros.
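To gauge whether rounding matters in a given case, one can compute how often a rounded Normal draw would land on zero. This is a rough sketch assuming round-to-nearest with a fixed step; the helper names and example values are illustrative:

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Normal(mu, sigma)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def prob_rounds_to_zero(mu, sigma, step=1.0):
    """P(a Normal(mu, sigma) draw rounds to 0 at the given rounding step)."""
    half = step / 2
    return normal_cdf(half, mu, sigma) - normal_cdf(-half, mu, sigma)

# Mean far from zero relative to sigma: rounding rarely produces zeros
print(prob_rounds_to_zero(mu=5.0, sigma=2.0))
# Mean close to zero: rounding alone produces a noticeable share of zeros
print(prob_rounds_to_zero(mu=1.0, sigma=2.0))
```

When this probability is not negligible, the observed zeros are a blend of both components and the two parts can no longer be fitted independently.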

jonsedar · March 8, 2021, 7:31am

Thanks for the link - will read. I thought it interesting that McElreath’s model treats the mixing parameter pi as continuous (he uses a Normal with invlogit link), presumably to keep the parameter space smooth and let his MvN priors work…
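As a generic illustration of that kind of link (not McElreath's actual code), an unconstrained Normal quantity can be mapped to a probability with the inverse logit. The prior scale of 1.5 below is an arbitrary assumption:

```python
import math
import random

def invlogit(x):
    """Map a real number to a probability, avoiding overflow for large |x|."""
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    e = math.exp(x)
    return e / (1 + e)

rng = random.Random(2)
# Hypothetical prior: a Normal(0, 1.5) quantity on the logit scale
logit_draws = [rng.gauss(0.0, 1.5) for _ in range(5)]
print([round(invlogit(a), 3) for a in logit_draws])
```

Working on the logit scale lets the mixing probability share a multivariate Normal prior with other unconstrained parameters.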

EDIT:

Also, someone on that thread called such a mixture “zero-augmented”, to indicate that the continuous distribution of choice might not have any support at zero. I quite like that terminology, will use.
