Estimate Multivariate Normal or StudentT distribution parameters for three dimensional data

hi,

i acquired data using an MEG / EEG from a number of participants. for each participant, i have a matrix of data with the shape (n_channels, n_samples), where n_channels is the number of electrodes or magnetometers i acquired the data from and n_samples are voltages or magnetic field strength that i measured every x ms (typically 1ms or 10ms)…

the data in the matrix are averaged data from a number of events that all happened at the same time. for instance, i know that at sample 20, an image was shown to the participants.

i thus concatenate the data over all participants, so i have a 3d matrix of the shape (n_participants, n_channels, n_samples). What I want is an estimate of the mean at each of the samples and channels.

with the help of @jlindbloom, i came up with this model, that works quite nicely (the scale of the data is in the range of 1e-12, so i need the normalizing):

nu_min=2.5,
nu_mean=30

norm_factor = np.std(data)

with pm.Model() as model:
        n_channels = data.shape[1]
        n_samples = data.shape[2]

        mu_normalized = pm.Normal('mu normalized', mu=0, sigma=1,
                                  shape=(n_channels, n_samples))
        mu = pm.Deterministic('mu', mu_normalized * norm_factor)
        std_normalized = pm.Uniform('std normalized', lower=1e-1, upper=1e1,
                                    shape=(n_channels, 1))
        std = pm.Deterministic('std', std_normalized * norm_factor)
        nu = pm.Exponential('nu', 1 / (nu_mean - nu_min),
                            shape=(n_channels, 1)) + nu_min

        modeled_data = pm.StudentT('modeled', mu=mu, nu=nu, sigma=std,
                                   observed=data)

as i said, this works really well.

but the thing is: the model assumes that the covariance of the channels is diagonal. and this is not the case. the correlation is actually quite high and i would like to factor this into my model.

following a suggestion by @jlindbloom , i came up with this:

with pm.Model() as model:
        n_channels = data.shape[1]
        n_samples = data.shape[2]

        mu_normalized = pm.Normal('mu normalized', mu=0, sigma=1,
                                  shape=(n_channels, n_samples))
        mu = pm.Deterministic('mu', mu_normalized * norm_factor)

        chol_normalized, corr, stds_normalized = pm.LKJCholeskyCov(
            'chol_normalized', n=2, eta=2.0, sd_dist=pm.Exponential.dist(1.0),
            compute_corr=True
        )

        nu = pm.Exponential('nu', 1 / (nu_mean - nu_min),
                            shape=(n_channels, 1)) + nu_min

        chol = pm.Deterministic('chol', chol_normalized * norm_factor)
        stds = pm.Deterministic('stds', stds_normalized * norm_factor)
        cov = pm.Deterministic('cov', chol.dot(chol.T))

        modeled_data = pm.MvNormal('modeled', mu=mu, chol=chol,
                                     observed=data, shape=(n_channels, n_samples))

however, when i try to run this, i get the following error:

  File "home/bla/.venv/lib/python3.9/site-packages/pymc3/distributions/multivariate.py", line 116, in _quaddist
    raise ValueError("Invalid dimension for value: %s" % value.ndim)
ValueError: Invalid dimension for value: 3

i tried to pin down the error and it seems that, opposed to the pm.StudenT distribution, the multivariate ones do not allow a third dimension in the input.

how can i solve this? one possibility would be to fit the model separately for each channel. but i have the strong assumption that the covariance matrix is constant over time, so i would prefer to estimate for all my data.

thanks a lot in advance!
thomas