Difference between using pm.Normal and pm.MvNormal with a diagonal cov

Hi everyone,
I am currently working on Bayesian calibration of models, and I can't explain to myself the difference between defining my likelihood in these 2 ways:

  1. pm.Normal('llk', mu, sigma, observed), where mu and sigma are both nx1 vectors
  2. pm.MvNormal('llk', mu, cov, observed), where cov is a diagonal matrix

Note that mu and sigma are computed using a hierarchical model :slight_smile:

My guess is that I don't get the same result because 1) is trying to minimise each likelihood point separately, while 2) minimises only the product of my independent likelihoods. I am not sure, though, because I can't manage to understand how pm.MvNormal works.

Let me explain :slight_smile:

After sampling, I'm interested in computing the marginal likelihood, so I wanted to use the log likelihood computed for every sample. But instead of getting a log likelihood with shape (chains, samples, 1) for the pm.MvNormal likelihood, I get shape (chains, samples, n), where n is the number of observations, and I don't understand why.

I hope my explanation is clear enough, and I hope someone has an answer :sweat_smile:
Thanks so much

The last axis contains the related variables of an MvNormal, so a 1x100 MvNormal represents a single vector draw. A 2x100 MvNormal would represent two independent draws, each of which is a 100-long vector whose elements are jointly distributed (though not actually correlated in this case, because the covariance is diagonal).

The logp of a single MvNormal draw is a scalar regardless of how “long” it is. For a 1x100 you get something with shape [1], and for a 2x100 you get something with shape [2].
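
To make the shape difference concrete, here is a minimal sketch in plain Python (standard library only, so it is not PyMC's implementation). The helper names `normal_logp` and `mvnormal_diag_logp` and the data values are made up for illustration. It shows that n independent Normals give n log-density terms, a diagonal-covariance multivariate normal gives one scalar, and that scalar equals the sum of the n terms:

```python
import math


def normal_logp(x, mu, sigma):
    """Log density of a univariate Normal(mu, sigma) evaluated at x."""
    return (
        -0.5 * math.log(2 * math.pi)
        - math.log(sigma)
        - 0.5 * ((x - mu) / sigma) ** 2
    )


def mvnormal_diag_logp(x, mu, variances):
    """Log density of a multivariate normal whose covariance is diagonal.

    `variances` holds the diagonal of the covariance matrix, so the
    log-determinant is the sum of the log variances and the quadratic
    form is a sum of squared, scaled residuals.
    """
    n = len(x)
    log_det = sum(math.log(v) for v in variances)
    quad = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, variances))
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


# Hypothetical data: n = 4 observations with their own means and sigmas.
observed = [0.3, -1.2, 2.5, 0.8]
mu = [0.0, -1.0, 2.0, 1.0]
sigma = [1.0, 0.5, 2.0, 1.5]

# Independent Normals: one log-density term per observation (length n).
elementwise = [normal_logp(x, m, s) for x, m, s in zip(observed, mu, sigma)]

# Diagonal MvNormal: a single scalar for the whole vector.
joint = mvnormal_diag_logp(observed, mu, [s ** 2 for s in sigma])

print(len(elementwise))                        # number of per-point terms
print(math.isclose(sum(elementwise), joint))   # the two totals agree
```

So with a diagonal covariance the two parameterisations describe the same joint density. The difference is only in how the log density is reported: one term per observation, or one term per vector.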