Automatic imputation - inner workings question

Chmielar · February 15, 2022, 11:25pm

I managed to get the model working with automatically imputed missing measurements/observations.

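import numpy as np
import pymc3 as pm
import theano.tensor as tt  # assumed: PyMC3 with Theano, matching the tt alias used below

# Mask NaN/inf entries so that PyMC treats them as missing values to impute
masked_measured = np.ma.masked_invalid(measured)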

with pm.Model() as model:
    
    # Define priors for the unknown parameters
    # ...
    priors = tt.stack(priors)[:, None]

    # Define the surrogate models for the outputs using the priors defined above
    linear = sm_linear * priors
    rsm = tt.sum(sm_rsm * priors * priors.T, axis=2)
    out_mu = sm_intercept + tt.sum(linear, axis=0) + tt.sum(rsm, axis=1)

    loglikelihood = pm.MvNormal(
        "loglikelihood",
        mu=out_mu,
        cov=true_cov,
        observed=masked_measured,
        shape=(length, length),
    )
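
To make the masking step concrete, here is a minimal NumPy-only sketch of what np.ma.masked_invalid produces. The array values and the 2 x 3 layout (2 measurements as rows, 3 observed variables as columns) are hypothetical and not taken from the model above. As far as I understand it, PyMC creates one imputed entry per masked value, so the flat indices below are what an index like the 0 in loglikelihood_missing__0 would refer to; treat that mapping as an assumption to verify against your PyMC version.

```python
import numpy as np

# Hypothetical data: 2 measurements (rows) of 3 observed variables (columns).
# The NaN marks the single measurement that should be imputed.
measured = np.array([[1.0, 2.0, 3.0],
                     [np.nan, 2.1, 2.9]])

# Entries that are NaN or inf become masked; everything else is kept.
masked_measured = np.ma.masked_invalid(measured)

# Boolean array with True where a value is missing.
missing_mask = np.ma.getmaskarray(masked_measured)

# Flat (row-major) positions of the missing entries.
missing_flat_idx = np.flatnonzero(missing_mask)

print(missing_mask)
print(missing_flat_idx)
print(masked_measured.count())  # number of non-missing entries
```

Checking the mask this way before building the model helps confirm that exactly the intended entries are treated as missing.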

I set up a simple test in which I provide 2 measurements/observations for each of the length observed variables. I masked one of them and, given the known measurement uncertainty, I expected the mean of loglikelihood_missing__0 after imputation to be close to the only available measurement for that variable, but the two differ. Any ideas why?
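
One thing that may be worth checking (a possible explanation, not a confirmed diagnosis): for a fixed mean and covariance, a missing entry of a multivariate normal row has conditional mean mu_m + cov_mo cov_oo^{-1} (x_o - mu_o), where m and o index the missing and observed entries of that same row. The other measurement of the same variable sits in a different row, so it can influence the imputed value only indirectly, through the posterior of the parameters that determine out_mu. The NumPy sketch below illustrates the formula; conditional_mean is a hypothetical helper and all numbers are made up.

```python
import numpy as np

def conditional_mean(mu, cov, x, missing):
    """Mean of the missing entries of x given the observed ones, for x ~ N(mu, cov)."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    x = np.asarray(x, dtype=float)
    m = np.asarray(missing, dtype=bool)   # True where the entry is missing
    o = ~m                                # True where the entry is observed
    cov_mo = cov[np.ix_(m, o)]            # cross-covariance missing/observed
    cov_oo = cov[np.ix_(o, o)]            # covariance among observed entries
    return mu[m] + cov_mo @ np.linalg.solve(cov_oo, x[o] - mu[o])

# Hypothetical model mean for 2 variables and one row with variable 0 missing.
mu = np.array([1.0, 2.0])
row = np.array([np.nan, 2.4])
missing = np.isnan(row)

# Diagonal covariance: the observed entry carries no information about the missing one.
cov_diag = np.diag([0.1, 0.1])
mean_diag = conditional_mean(mu, cov_diag, row, missing)

# Correlated covariance: the imputed mean shifts toward the observed residual.
cov_corr = np.array([[0.1, 0.05],
                     [0.05, 0.1]])
mean_corr = conditional_mean(mu, cov_corr, row, missing)

print(mean_diag, mean_corr)
```

With a diagonal covariance the imputed mean is just the model mean for that variable, so a gap between the imputed value and the single available measurement would be expected whenever out_mu itself differs from that measurement.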

I watched (and recommend) @junpenglao's PyMCon talk and analysed the code, but I am still struggling to understand how the imputation of observed variables is implemented in PyMC.
