I am trying to work with automatic imputation in a multivariate model and am seeing that while there previously used to be multivariate imputation, there is currently not.
For example Using this notebook: [pymcon] Missing value handling.ipynb · GitHub
loc = np.array([-2., 4.])
scale = np.array([2., 3.])
corr = np.array([[1., .65],
[.65, 1.]])
cov = np.outer(scale, scale) * corr
chol = np.linalg.cholesky(cov)
n = 500
sample = np.einsum(
"ij,...j->...i",
chol, np.random.randn(n, *loc.shape)
) + loc
n_mask = 50
row_idx = np.random.choice(n, n_mask, replace=False)
col_idx = np.random.binomial(1, .5, size=n_mask)
mask = np.zeros(sample.shape)
mask[row_idx, col_idx] = np.nan
sample_masked = np.ma.array(sample, mask=np.isnan(mask))
masked_element_idx = np.stack(np.where(sample_masked.mask))
with pm.Model() as m4:
mu = pm.Normal('mu', mu=0, sigma=10, shape=2)
sd_dist = pm.HalfCauchy.dist(2.5)
chol_cov, *_ = pm.LKJCholeskyCov(
'chol_cov', n=2, eta=1, sd_dist=sd_dist,
compute_corr=True, store_in_trace=False)
obs = pm.MvNormal(
'obs', mu=mu, chol=chol_cov, observed=sample_masked)
trace4 = pm.sample(2000, cores=4, return_inferencedata=True)
I am getting the current error:
NotImplementedError: Automatic inputation is only supported for univariate RandomVariables, but obs is multivariate .
I am currently running pymc 5.0.0.
Furthermore, if imputation from multivariate random variables has been removed in pymc, I was wondering how I would try and deal with this