# Estimate Multivariate Normal or StudentT distribution parameters for three dimensional data

hi,

i acquired data using an MEG / EEG from a number of participants. for each participant, i have a matrix of data with the shape `(n_channels, n_samples)`, where `n_channels` is the number of electrodes or magnetometers i acquired the data from and `n_samples` are voltages or magnetic field strength that i measured every `x` ms (typically 1ms or 10ms)…

the data in the matrix are averaged data from a number of events that all happened at the same time. for instance, i know that at sample 20, an image was shown to the participants.

i thus concatenate the data over all participants, so i have a 3d matrix of the shape `(n_participants, n_channels, n_samples)`. What I want is an estimate of the mean at each of the samples and channels.

with the help of @jlindbloom, i came up with this model, that works quite nicely (the scale of the data is in the range of 1e-12, so i need the normalizing):

``````nu_min=2.5,
nu_mean=30

norm_factor = np.std(data)

with pm.Model() as model:
n_channels = data.shape[1]
n_samples = data.shape[2]

mu_normalized = pm.Normal('mu normalized', mu=0, sigma=1,
shape=(n_channels, n_samples))
mu = pm.Deterministic('mu', mu_normalized * norm_factor)
std_normalized = pm.Uniform('std normalized', lower=1e-1, upper=1e1,
shape=(n_channels, 1))
std = pm.Deterministic('std', std_normalized * norm_factor)
nu = pm.Exponential('nu', 1 / (nu_mean - nu_min),
shape=(n_channels, 1)) + nu_min

modeled_data = pm.StudentT('modeled', mu=mu, nu=nu, sigma=std,
observed=data)
``````

as i said, this works really well.

but the thing is: the model assumes that the covariance of the channels is diagonal. and this is not the case. the correlation is actually quite high and i would like to factor this into my model.

following a suggestion by @jlindbloom , i came up with this:

``````with pm.Model() as model:
n_channels = data.shape[1]
n_samples = data.shape[2]

mu_normalized = pm.Normal('mu normalized', mu=0, sigma=1,
shape=(n_channels, n_samples))
mu = pm.Deterministic('mu', mu_normalized * norm_factor)

chol_normalized, corr, stds_normalized = pm.LKJCholeskyCov(
'chol_normalized', n=2, eta=2.0, sd_dist=pm.Exponential.dist(1.0),
compute_corr=True
)

nu = pm.Exponential('nu', 1 / (nu_mean - nu_min),
shape=(n_channels, 1)) + nu_min

chol = pm.Deterministic('chol', chol_normalized * norm_factor)
stds = pm.Deterministic('stds', stds_normalized * norm_factor)
cov = pm.Deterministic('cov', chol.dot(chol.T))

modeled_data = pm.MvNormal('modeled', mu=mu, chol=chol,
observed=data, shape=(n_channels, n_samples))
``````

however, when i try to run this, i get the following error:

``````  File "home/bla/.venv/lib/python3.9/site-packages/pymc3/distributions/multivariate.py", line 116, in _quaddist
raise ValueError("Invalid dimension for value: %s" % value.ndim)
ValueError: Invalid dimension for value: 3
``````

i tried to pin down the error and it seems that, opposed to the `pm.StudenT` distribution, the multivariate ones do not allow a third dimension in the input.

how can i solve this? one possibility would be to fit the model separately for each channel. but i have the strong assumption that the covariance matrix is constant over time, so i would prefer to estimate for all my data.