Mechanics of missing value imputation

I’m having a hard time understanding what goes on under the hood during missing value imputation. I understand that we set priors (usually using a normal distribution) and then use the non-missing data as the likelihood. However, what’s really confusing me is why each missing value in the output gets its own mean and standard deviation. Since these values are missing, there’s nothing really differentiating them, so why would the mean and standard deviation not be the same for all the missing values?

It depends on what the other inputs to the likelihood/sampling distribution are. In the very simple case where the likelihood just has a scalar mean, all of the missing values would have the same posterior mean and standard deviation:

import numpy as np
import pymc as pm

# Simulate data and mark the first three values as missing
fake_data = np.random.normal(2, 5, size=100)
fake_data[:3] = np.nan

with pm.Model() as m:

    mu = pm.Normal('mu', 0, 10)
    sigma = pm.HalfNormal('sigma', 20)

    # NaN entries in the observed data are treated as missing and imputed automatically
    x = pm.Normal('x', mu, sigma, observed=fake_data)

    trace = pm.sample(2000, cores=2)

So if we look at the posteriors of the three missing values in this dataset, their means and standard deviations are essentially identical. If I ran the sampler longer they would converge to the same values.
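One way to check this concretely (a sketch, assuming ArviZ is installed; PyMC creates a separate variable for the imputed entries, with a name such as x_missing that depends on the version) is to summarize the trace:

import arviz as az

# The imputed entries show up as their own variables in the posterior summary,
# with nearly identical means and standard deviations
az.summary(trace)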

However, in many models the mean might be a linear model or some other hierarchical structure with covariates, in which case the expected value is generally different for every observation, and we would not expect the imputed values to be the same.
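For instance, here is a minimal sketch of that situation (the covariate x_cov and the coefficient names are invented for illustration): each row has its own expected value a + b * x_cov[i], so each imputed y gets a posterior centered at a different place.

import numpy as np
import pymc as pm

# Hypothetical covariate and outcome; the first three outcomes are missing
x_cov = np.random.normal(0, 1, size=100)
y = 1.0 + 2.0 * x_cov + np.random.normal(0, 0.5, size=100)
y[:3] = np.nan

with pm.Model() as linear_model:
    a = pm.Normal('a', 0, 10)
    b = pm.Normal('b', 0, 10)
    sigma = pm.HalfNormal('sigma', 5)

    # Each missing y[i] is imputed from a posterior centered near a + b * x_cov[i],
    # so the imputed values are generally different from one another
    pm.Normal('y', a + b * x_cov, sigma, observed=y)

    trace = pm.sample(2000, cores=2)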


Thanks for the example; I partly understand the explanation now. Regarding different values when we have a hierarchical structure: why would the expected values be different for every observation? We don’t have the observation, so we don’t know which group in the hierarchy the value belongs to. So we would assign the same expected value to each: the average of each group’s effect, weighted by the probability of the missing value being in that group.

You can have predictors at the hierarchical levels as well, but if you truly don’t have any information that distinguishes them, then you’re correct that they will all be the same.
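To make that concrete, here is a hedged sketch (the group structure and names are invented for illustration): when the group membership of a row with a missing outcome is known, the imputed value is centered near its own group's mean; if nothing distinguishes the missing rows, their posteriors are interchangeable.

import numpy as np
import pymc as pm

# Hypothetical grouped data: group membership is known even where y is missing
group = np.repeat(np.arange(4), 25)
true_means = np.array([0.0, 2.0, 4.0, 6.0])
y = np.random.normal(true_means[group], 1.0)
y[[0, 30, 60]] = np.nan  # one missing outcome in three different groups

with pm.Model() as hierarchical_model:
    mu_group = pm.Normal('mu_group', 0, 10, shape=4)
    sigma = pm.HalfNormal('sigma', 5)

    # Because group membership distinguishes the missing rows,
    # each imputed value is centered near its own group's mean
    pm.Normal('y', mu_group[group], sigma, observed=y)

    trace = pm.sample(2000, cores=2)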