I managed to get the model working with automatically imputed missing measurements/observations:
import numpy as np
import pymc3 as pm
import theano.tensor as tt

# Mask NaN entries so that they are treated as missing and imputed
masked_measured = np.ma.masked_invalid(measured)
with pm.Model() as model:
    # Define priors for the unknown parameters
    # ...
    priors = tt.stack(priors)[:, None]
    # Define the surrogate models for the outputs using the priors defined above
    linear = sm_linear * priors
    rsm = tt.sum(sm_rsm * priors * priors.T, axis=2)
    out_mu = sm_intercept + tt.sum(linear, axis=0) + tt.sum(rsm, axis=1)
    loglikelihood = pm.MvNormal(
        "loglikelihood", mu=out_mu, cov=true_cov, observed=masked_measured, shape=[length, length]
    )
I set up a simple test in which I provide 2 measurements/observations for each of the length
observed variables. I masked one of them and, given the known measurement uncertainty, I expected the posterior mean of loglikelihood_missing__0
after imputation to be close to the only available measurement for that variable, but the two are different. Any ideas why?
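For comparison, here is a minimal NumPy-only sketch of the test setup together with the textbook conditional mean of a missing entry of a multivariate normal, given the other entries in the same row. All numbers, the fixed `mu` and `cov` (stand-ins for `out_mu` and `true_cov`), and the helper `conditional_mean` are hypothetical and chosen for illustration only. This is not a description of what PyMC does internally, only a reference value to compare the imputed mean against.

```python
import numpy as np

# Hypothetical data: 2 measurements (rows) of 3 observed variables (columns).
measured = np.array([
    [1.0, 2.0, 3.0],
    [np.nan, 2.2, 2.9],  # second measurement of the first variable is missing
])
masked_measured = np.ma.masked_invalid(measured)

# Hypothetical stand-ins for out_mu and true_cov, held fixed here.
mu = np.array([1.1, 2.1, 3.0])
cov = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.04, 0.00],
    [0.00, 0.00, 0.04],
])


def conditional_mean(row, mu, cov):
    """Mean of the masked entries of one row given its unmasked entries,
    for a multivariate normal with fixed mean `mu` and covariance `cov`."""
    miss = np.ma.getmaskarray(row)
    obs = ~miss
    resid = row.data[obs] - mu[obs]
    gain = np.linalg.solve(cov[np.ix_(obs, obs)], resid)
    return mu[miss] + cov[np.ix_(miss, obs)] @ gain


# Conditional mean of the missing entry in the second row.
imputed_mean = conditional_mean(masked_measured[1], mu, cov)
print(imputed_mean)
```

In this sketch the conditional mean depends on `mu` and on the residuals of the other variables in the same row. The remaining measurement of the masked variable (1.0 here) enters only indirectly, through whatever it contributes to the estimate of `mu`.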
I watched (and recommend) @junpenglao's PyMCon talk and analysed the code, but I am still struggling to understand how imputation of observed variables is implemented in PyMC.