Multivariate distribution of an n*m matrix

Hi everyone,

Apologies if the question is silly, trying to understand how the multivariate distributions work.

Suppose I have a matrix X with dimension n \times m that I want to model with a multivariate Gaussian distribution. Is this possible in PyMC?
How should the covariance matrix be defined? Is it a rank-4 tensor? And how could I write it in PyMC?

If you can point me to a source where I can go into the details of this, it would be much appreciated.

Thanks all in advance,

I’d model \text{vec}(X), the length-nm vector that results from stacking the columns of X (x.T.ravel()), as a usual multivariate normal. You’d have to think about the structure of the covariance matrix, but it would be easier than trying to reason about a higher-dimensional random variable. You could reshape \text{vec}(X) back to X after you draw it to use in further computation.


Thanks @jessegrabowski, I was thinking of doing a stack of the column vectors.

X=\begin{bmatrix} | & | & & | \\ x_1 & x_2 & \cdots & x_m \\ | & | & & | \end{bmatrix} = [x_1, x_2, \dots, x_m]

And I can model each vector x_i as a multivariate distribution with an n \times n covariance, and then have a multivariate distribution for X with an m \times m covariance. Do you think this may be correct?

Yeah that’s almost totally equivalent. The only difference is that you would be imposing a block-diagonal structure on the covariance matrix (that is, no correlation between the columns).
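The block-diagonal equivalence can be made concrete with NumPy (toy PSD matrices, hypothetical values):

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)
n, m = 3, 2

# one toy n x n PSD covariance per column
col_covs = []
for _ in range(m):
    A = rng.normal(size=(n, n))
    col_covs.append(A @ A.T + n * np.eye(n))

# modelling each column with its own MvNormal is equivalent to one
# MvNormal on vec(X) whose covariance is this block-diagonal matrix
Sigma = block_diag(*col_covs)  # (n*m) x (n*m)
```

The off-diagonal blocks of `Sigma` are all zeros, which is exactly the "no correlation between columns" constraint.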


I see. To avoid this additional constraint, I could stack the columns of X vertically with ravel as you suggest, with the covariance

X=\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}

\Sigma =\begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1m} \\ \Sigma_{21} & \Sigma_{22} & \cdots & \Sigma_{2m} \\ \vdots & & \ddots & \vdots \\ \Sigma_{m1} & \Sigma_{m2} & \cdots & \Sigma_{mm} \end{bmatrix}

where each \Sigma_{ij} is an n \times n covariance matrix itself.

I think it makes sense, though this second option may be much more computationally intensive.
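One common way to get a valid full covariance of this block form without estimating every \Sigma_{ij} freely is a Kronecker (matrix-normal) structure, where block (i, j) is \Sigma_{\text{cols}}[i,j] \cdot \Sigma_{\text{rows}}; PyMC also ships a MatrixNormal distribution built around this idea. A sketch with toy values (all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

# toy PSD factors
B = rng.normal(size=(n, n))
Sigma_rows = B @ B.T + n * np.eye(n)   # n x n within-column covariance
A = rng.normal(size=(m, m))
Sigma_cols = A @ A.T + m * np.eye(m)   # m x m between-column covariance

# Kronecker structure: block (i, j) of Sigma is Sigma_cols[i, j] * Sigma_rows,
# and Sigma is PSD whenever both factors are
Sigma = np.kron(Sigma_cols, Sigma_rows)  # (n*m) x (n*m)
```

This shrinks the number of free parameters from O((nm)^2) to O(n^2 + m^2), at the cost of assuming the same within-column covariance for every column.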

Yeah, estimating more stuff is always harder. It might make sense to constrain the covariances, but it’s impossible for me to say without knowing what the columns represent. OTOH, it’s worth a try to see what you get if you ask for an LKJCholeskyCov with n=n*m.


Yes, I can try that.

The x’s are actually priority scales, so I was previously modelling them with Dirichlet distributions (summing to 1), but each of them separately, with no correlation between them. I wanted to impose a hierarchical structure by letting them covary, if I understand correctly how hierarchical modelling works.

As you said, including more constraints on the \Sigma_{ij} might help the model converge faster… and might the Dirichlet distributions help with this?

This might mean that for i=j the covariance is given by the covariance of the Dirichlet distribution, and for i \neq j by the LKJCholeskyCov.

You can’t (shouldn’t?) use a Dirichlet for the covariance because there’s no guarantee a matrix built from Dirichlet draws will be positive semi-definite. If you still want all the x’s to sum to one you could draw the x vector, reshape it, then apply a softmax. LKJCholeskyCov is basically always recommended for the covariance prior.


@jessegrabowski thanks for the suggestion, this was actually my doubt and I was looking for alternatives. I will try it this way.