Choosing between Matrix Normal and Multivariate Normal for Bayesian inference

I’m working on a Bayesian inference problem where I need to estimate a graph structure G, with a spike-and-slab prior on each edge of G. My likelihood model is built on an observed D×D data matrix R and a D²×D² covariance matrix S. I’ve been exploring both the Matrix Normal (MN) and Multivariate Normal (MVN) approaches for specifying the likelihood.
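For context, the edge prior I have in mind looks roughly like the sketch below. This is a minimal NumPyro version; the continuous spike-and-slab (a two-component normal mixture) and all the scales and inclusion probabilities are placeholder assumptions, not my actual prior:

```python
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist

def edge_weights_prior(D, p_edge=0.1, spike_scale=0.01, slab_scale=1.0):
    # Continuous spike-and-slab on each upper-triangular edge of G:
    # a two-component normal mixture (narrow "spike" vs. wide "slab"),
    # which stays differentiable for gradient-based inference.
    # Meant to be called inside a NumPyro model.
    n_edges = D * (D - 1) // 2
    probs = jnp.broadcast_to(jnp.array([1.0 - p_edge, p_edge]), (n_edges, 2))
    scales = jnp.broadcast_to(jnp.array([spike_scale, slab_scale]), (n_edges, 2))
    mixture = dist.MixtureSameFamily(
        dist.Categorical(probs=probs),
        dist.Normal(jnp.zeros((n_edges, 2)), scales),
    )
    return numpyro.sample("edge_weights", mixture.to_event(1))
```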

Here’s where my dilemma arises:

Matrix Normal (MN): Since R is D×D, using MN seems natural. I’ve factorized S, which is D²×D², into a row covariance matrix U and a column covariance matrix V, each D×D, and I can write my likelihood as MN(R, U, V). However, this Kronecker factorization of S into U and V is computationally expensive, and memory usage becomes an issue as D grows.
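For reference, this is roughly how I evaluate the MN log-density once U and V are in hand (a minimal NumPy sketch; M, U, and V are placeholders for my mean and factor estimates). The point is that it only ever touches D×D objects:

```python
import numpy as np
from scipy import linalg

def matrix_normal_logpdf(R, M, U, V):
    """log MN(R | M, U, V) for D x D data R, row cov U, column cov V.

    Uses Cholesky solves, so the cost is O(D^3) time and O(D^2) memory;
    the D^2 x D^2 Kronecker covariance is never materialized.
    """
    D = R.shape[0]
    Lu = linalg.cholesky(U, lower=True)
    Lv = linalg.cholesky(V, lower=True)
    # Whiten the residual: A = Lu^{-1} (R - M) Lv^{-T}
    A = linalg.solve_triangular(Lu, R - M, lower=True)
    A = linalg.solve_triangular(Lv, A.T, lower=True).T
    logdet_U = 2.0 * np.sum(np.log(np.diag(Lu)))
    logdet_V = 2.0 * np.sum(np.log(np.diag(Lv)))
    return -0.5 * (D * D * np.log(2.0 * np.pi)
                   + D * logdet_U + D * logdet_V
                   + np.sum(A * A))
```

(scipy.stats.matrix_normal.logpdf computes the same quantity; I wrote it out to make the O(D³) time / O(D²) memory cost explicit.)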

Multivariate Normal (MVN): Since MN is equivalent to an MVN with vectorized mean and Kronecker-structured covariance (V ⊗ U under the column-stacking vec convention), I could in principle fit the data using an MVN with mean vec(R) and covariance S. This avoids the factorization step, which is appealing for large D. But I’m concerned that I might lose some of the interpretability and structure that MN inherently provides, and since S is D²×D², I’m worried that inferring the posterior would be challenging due to the dimensionality and memory cost.
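To convince myself the equivalence holds (and to see where the memory goes), I ran something like the following for a tiny D. The random U and V here are just placeholder SPD matrices; note that with column-stacking vec the MVN covariance is kron(V, U):

```python
import numpy as np
from scipy.stats import matrix_normal, multivariate_normal

rng = np.random.default_rng(0)
D = 4
M = rng.standard_normal((D, D))
# Random SPD row/column covariances (placeholders for my U and V)
A = rng.standard_normal((D, D))
B = rng.standard_normal((D, D))
U = A @ A.T + D * np.eye(D)
V = B @ B.T + D * np.eye(D)
R = matrix_normal.rvs(mean=M, rowcov=U, colcov=V, random_state=rng)

# MN and MVN log-densities agree (column-stacking vec => cov = kron(V, U))
lp_mn = matrix_normal.logpdf(R, mean=M, rowcov=U, colcov=V)
lp_mvn = multivariate_normal.logpdf(R.flatten(order="F"),
                                    mean=M.flatten(order="F"),
                                    cov=np.kron(V, U))
assert np.isclose(lp_mn, lp_mvn)

# Memory: dense S has D^4 entries; U and V together have only 2 * D^2
for d in (10, 100, 1000):
    print(d, f"S: {d**4 * 8 / 1e9:.3g} GB", f"U,V: {2 * d**2 * 8 / 1e6:.3g} MB")
```

The last loop is what worries me: at D = 100, storing S densely already takes about 0.8 GB, and a single Cholesky of it is O(D⁶), whereas U and V stay tiny.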

Questions:

  1. I’m aware of the theoretical equivalence between MN and MVN through vectorization. Is using MVN instead of MN a valid alternative approach in practice?
  2. If so, which approach would be more computationally feasible and structurally appropriate for my inference needs, i.e., which would make the better likelihood?
  3. What would be a good inference method? Would NUTS take too long, making VI the better choice? I’ve also seen that normalizing flows are popular these days. (A sketch of the kind of comparison I have in mind is below.)
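For concreteness, this is the sort of comparison I was planning to run, using NumPyro as an example backend. Everything here is a placeholder rather than my actual model: the likelihood is MN(R | 0, u²I, v²I), written through its MVN equivalence as a numpyro.factor, standing in for the real graph model:

```python
import jax
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS, SVI, Trace_ELBO
from numpyro.infer.autoguide import AutoNormal

def model(R):
    # Toy stand-in for the real model: one scale parameter per axis.
    D = R.shape[0]
    u = numpyro.sample("u", dist.HalfNormal(1.0))
    v = numpyro.sample("v", dist.HalfNormal(1.0))
    # MN(R | 0, u^2 I, v^2 I) log-density: vec(R) ~ N(0, (u v)^2 I),
    # so it can be evaluated in O(D^2)
    logp = (-0.5 * jnp.sum((R / (u * v)) ** 2)
            - D * D * jnp.log(u * v)
            - 0.5 * D * D * jnp.log(2.0 * jnp.pi))
    numpyro.factor("lik", logp)

R = jax.random.normal(jax.random.PRNGKey(0), (20, 20))

# NUTS: asymptotically exact posterior, but every leapfrog step pays
# the full likelihood-gradient cost, so it can be slow for large D
mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=500)
mcmc.run(jax.random.PRNGKey(1), R)

# SVI with a mean-field Gaussian guide: cheaper per iteration, at the
# cost of a biased (typically underdispersed) posterior approximation
guide = AutoNormal(model)
svi = SVI(model, guide, numpyro.optim.Adam(1e-2), Trace_ELBO())
svi_result = svi.run(jax.random.PRNGKey(2), 2000, R)
```

My understanding is that a normalizing-flow guide (e.g., AutoBNAFNormal in NumPyro) would slot in where AutoNormal is, trading some speed for a more flexible posterior approximation.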