Multivariate distribution of a n*m matrix

Raimondo_Mancinelli · June 21, 2023, 1:05pm

Hi everyone,

Apologies if the question is silly, trying to understand how the multivariate distributions work.

let’s put the case that I have a x matrix to model with dimension n \times m with a Multivariate Gaussian distribution. Is this possible in pymc?
How should be the covariance matrix defined? Is It a rank 4 tensor? And how could I write it in pymc?

If you can redirect me to a source where I can go into the detail of this, would be very appreciated.

Thanks all in advance,
Ray

jessegrabowski · June 21, 2023, 1:13pm

I’d model \text{vec}(X), the n \times m vector that results from stacking the columns of X (x.T.ravel()), as a usual multivariate normal. You’d have to think about the structure of the covariance matrix, but it would be easier than trying to think about a higher dimensional random variable. You could reshape \text{vec}(X) back to X after you draw it to use in further computation.

Raimondo_Mancinelli · June 21, 2023, 1:29pm

Thanks @jessegrabowski, i was thinking of doing a stack of the column vectors.

X=\begin{bmatrix} | & |& &| \\ x_1 & x_2 & . . .& x_m \\ | & | & & | \end{bmatrix} = [x_1, x_2, ..., x_m]

And I can have each vector as MvDistribution with each cov having dim n \times n, and then have a Mv distribution for the X with covariance m \times m. Do you think this may be correct?

jessegrabowski · June 21, 2023, 1:49pm

Yeah that’s almost totally equivalent. The only difference is that you would be imposing a block-diagonal structure on the covariance matrix (that is, no correlation between the columns).

Raimondo_Mancinelli · June 21, 2023, 1:52pm

I see, to not have this additional constraint, with the X stacked vertically with ravel as you suggest and with the covariance that is

X=\begin{bmatrix} x_1 \\ x_2 \\ ... \\ x_m \end{bmatrix}

\Sigma =\begin{bmatrix} \Sigma_{11} & \Sigma_{12} & &\Sigma_{1m} \\ \Sigma_{21} & \Sigma_{22} & . . .& \Sigma_{2m} \\ \Sigma_{m1}& \Sigma_{m2} & & \Sigma_{mm} \end{bmatrix}

and each \Sigma_{ij} is a n \times n covariance matrix itself

I think it makes sense, this second option may be way more intensive computationally, I think

jessegrabowski · June 21, 2023, 2:27pm

Yeah estimating more stuff is always harder. It might make sense to constrain the covariances, but it’s impossible for me to say without knowing what the columns represent. OTOH it’s worth a try to see what you get if you ask for a LKJCholeskyCov with n=n*m

Raimondo_Mancinelli · June 21, 2023, 2:43pm

Yes, I can try that.

The x are actually priority scales, so before, I was modelling them with Dirichlet distributions (sum to 1), actually, but each of them separately, with no correlation between them. I wanted to try to impose the hierarchical structure imposing them to covariate, if I get right how the hierarchical modelling works.

As you said, potentially including more constrain on the \Sigma_{ij} might help the model to converge faster… and the Dirichlet distributions might help in this?

this might mean that for i=j, the covariance is given by the cov of the Dirichlet distribution, and for i!=j the LKJCholeskyCov

jessegrabowski · June 21, 2023, 3:02pm

You can’t (shouldn’t?) use a Dirchlet for the covariance because there’s no guarantee a draw from the Dirchlet will be positive semi-definite. If you still want all the x’s to sum to one you could draw the x vector, reshape it, then apply a softmax. LKJCholesky is basically always recommended for the covariance prior.

Raimondo_Mancinelli · June 21, 2023, 3:06pm

@jessegrabowski thanks for the suggestion, this was actually my doubts and I was looking for alternatives. I will try this way

Topic		Replies	Views
Multivariate normal with diagonal covariance Questions	5	3055	March 20, 2019
About set of Multivariate Normal Distributions Questions	2	1168	July 3, 2018
How to marginalize on a Multivariate Normal distribution Questions	13	8094	November 23, 2018
Modelling multivariate distributions with a sparse covariance structure v5 modeling	3	532	December 13, 2022
About settings in MvNormal() Questions	3	584	January 11, 2022

Multivariate distribution of a n*m matrix

Related topics