There are not many ways to debug the error in sample_ppc. For example, in your case it is essentially doing y_modeled.random(trace[random_index])
, which will return the same error message so it doesnt help.
I would probably just write my own OOS prediction code instead of building a new model, which is much faster and you also have better control:
nppc = 100
ppc = []
for idx in range(nppc):
X_mask = np.random.multivariate_normal(means, cov, X_test.shape[0])
X_mod = X_mask * X_test.mask + X_test.data * (1 - X_test.mask)
point = trace[idx]
b = point['beta']
eps = point['sd']
est = np.matmul(X_mod, b)
y_pred = np.random.normal(est, eps)
ppc.append(y_pred)
However, the above code is wrong in multivariate case. You should NOT use a i.i.d. multivariate drop-in to replace the missing value in X_test
. In this case, the missing information is partially missing – for example, say one row in X
is [ma., 0.5, 1.2]
and X ~ MvNormal([0., 0., 0.], cov)
. Instead of generate X_ ~ MvNormal([0., 0., 0.], cov)
and do ma. = X_[0]
, you need to work out the conditional probability of ma.
by marginalizing the MvNormal using [.5, 1.2] and cov, and generate from this new conditional probability (also a Gaussian) instead.