I'm not sure about the effects of replacing 85% missingness with the imputed mean - a) that's just way too much imputation, and b) single imputation is soooo un-Bayesian…
However, I’ve been looking into Multiple Imputation recently, and this book seems pretty good:
FancyImpute supports most of the single imputation strategies, but it seems its MICE algorithm only supports ordinal data.
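As an aside on the multiple-imputation side of this: once you have m completed datasets, the per-imputation estimates are usually pooled with Rubin's rules, which is simple enough to do by hand. A minimal sketch (plain Python, my own helper names, not from any library):

```python
def rubins_rules(estimates, variances):
    """Pool m point estimates and their within-imputation variances
    using Rubin's rules: pooled estimate is the mean of the estimates,
    total variance is within + (1 + 1/m) * between."""
    m = len(estimates)
    q_bar = sum(estimates) / m                                # pooled estimate
    w = sum(variances) / m                                    # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)    # between-imputation variance
    t = w + (1 + 1 / m) * b                                   # total variance
    return q_bar, t

# e.g. three imputations of the same parameter:
q, t = rubins_rules([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The (1 + 1/m) correction is what penalizes using only a small number of imputations, which is part of why single imputation understates uncertainty.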
For Bayesian PCA in PyMC3, I'm a bit worried we'll run into exactly the same problem as with FA - if the dataset is large enough, we simply can't specify a loading matrix U = Normal('factor_loadings', mu=mu_prior, tau=tau_prior, shape=(N, K)) for large N.
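To make the scaling concern concrete, here's a back-of-envelope count (my own sketch, not PyMC3 code): under mean-field ADVI each scalar latent gets its own mean and (log-)scale variational parameter, so an (N, K) Normal matrix costs roughly 2·N·K floats just for the approximation.

```python
def advi_param_bytes(N, K, bytes_per_float=8):
    """Rough memory footprint of the variational parameters for an
    (N, K) Normal latent matrix under mean-field ADVI: one mean and
    one scale parameter per scalar latent, stored as float64."""
    n_latents = N * K
    return 2 * n_latents * bytes_per_float

# e.g. a million rows and 20 factors:
print(advi_param_bytes(10**6, 20) / 1e9, "GB")
```

The memory itself may be survivable, but every one of those parameters also needs a gradient at each ADVI step, which is where a per-row loading matrix really hurts.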
Another thing I wanted to mention: a mixture of a Dirac delta and a Normal gives a "spike and slab" distribution. This is particularly problematic in the Indian Buffet Process (IBP), where we also have a Bernoulli "mask" matrix whose Bernoulli latents need to be marginalized away. Unlike my missing-value mixture, where the mask is observed (so no inference is needed), the Bernoulli mask in the IBP is completely hidden, which means its posterior needs to be inferred. However, it's quite unlikely that ADVI could come up with a reasonable approximation to such an awkwardly shaped spike-and-slab posterior.
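For anyone following along, the marginalization itself is just summing the Bernoulli out of the mixture. A minimal sketch of the resulting spike-and-slab density, with the Dirac spike softened to a tiny-width Normal so the density stays finite (that epsilon trick is my assumption, purely for numerics):

```python
import math

def spike_slab_logpdf(x, pi=0.5, sigma=1.0, eps=1e-6):
    """Log-density of a spike-and-slab with the Bernoulli marginalized
    out: with probability pi the value comes from a Normal(0, sigma)
    'slab', with probability 1 - pi it sits at zero (the 'spike',
    softened here to a Normal with tiny scale eps)."""
    norm = lambda v, s: math.exp(-0.5 * (v / s) ** 2) / (s * math.sqrt(2 * math.pi))
    return math.log(pi * norm(x, sigma) + (1 - pi) * norm(x, eps))
```

The huge density ratio between x = 0 and anywhere else is exactly the "awkward shape" at issue: a smooth mean-field Gaussian has no way to place mass on the spike and the slab at the same time.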
Cheers,
Hugo