How to marginalize on a Multivariate Normal distribution


#1

I am new to both probabilistic modeling and PyMC so please bear with me.

I have a model that learns a multivariate normal distribution, and now I’d like to find the probabilities over a partial test input.

My problem is very similar to the one described in this question. Specifically, I refer to the final comment, quoted below:

you need to work out the conditional probability of ma. by marginalizing the MvNormal using [.5, 1.2] and cov, and generate from this new conditional probability (also a Gaussian) instead.

How can I achieve this with PyMC?

Further, is it also possible to add this new observation and update the posterior of the Multivariate distribution?


#2

Seems it is actually much easier than I thought(!) From Wikipedia:

To obtain the marginal distribution over a subset of multivariate normal random variables, one only needs to drop the irrelevant variables (the variables that one wants to marginalize out) from the mean vector and the covariance matrix. The proof for this follows from the definitions of multivariate normal distributions and linear algebra.

Example

Let X = [X1, X2, X3] be multivariate normal random variables with mean vector μ = [μ1, μ2, μ3] and covariance matrix Σ (standard parametrization for multivariate normal distributions). Then the joint distribution of X′ = [X1, X3] is multivariate normal with mean vector μ′ = [μ1, μ3] and covariance matrix

\boldsymbol\Sigma' = \begin{bmatrix} \boldsymbol\Sigma_{11} & \boldsymbol\Sigma_{13} \\ \boldsymbol\Sigma_{31} & \boldsymbol\Sigma_{33} \end{bmatrix}
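In plain NumPy, with made-up values for μ and Σ just for illustration, the marginalization really is just index selection:

```python
import numpy as np

# Example joint distribution over X = [X1, X2, X3] (made-up values)
mu = np.array([0.0, 1.0, 2.0])
cov = np.array([[1.0, 0.3, 0.2],
                [0.3, 2.0, 0.5],
                [0.2, 0.5, 1.5]])

# Marginal over X' = [X1, X3]: keep rows/columns 0 and 2
keep = [0, 2]
mu_marg = mu[keep]
cov_marg = cov[np.ix_(keep, keep)]  # same as cov[[0, 2], :][:, [0, 2]]

print(mu_marg)   # [0. 2.]
print(cov_marg)  # [[1.  0.2]
                 #  [0.2 1.5]]
```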

And yes, it is also possible to add this new observation to the model, something like pm.MvNormal('marg', mu[0, 2], cov[[0, 2], :][:, [0, 2]], observed = x02)


#3

Thank you for your reply. However, it is still not clear to me how I might do this with PyMC.

Also, what I am really looking for is something like P(X1|X2=p) where the distribution is over X = [X1, X2, X3]. That is, I want to find the distribution over X1 when I only know that X2 is equal to some arbitrary p.


#4

Also, I don’t think indexing with mu[0, 2] is allowed. It throws IndexError: too many indices for array


#5

You should change it to mu[[0, 2]] instead.
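The difference, with a toy 1-d array:

```python
import numpy as np

mu = np.array([0.1, 0.2, 0.3])  # a 1-d mean vector

print(mu[[0, 2]])  # fancy indexing with a list of indices: [0.1 0.3]

try:
    mu[0, 2]  # two indices into a 1-d array
except IndexError as e:
    print(e)  # "too many indices for array ..."
```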


#6

For conditional distribution see: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions

So I would say you should first drop the irrelevant variables of X = [X1, X2, X3] to get the marginal distribution X’ = [X1, X2], and then compute the conditional P(X1|X2=p).


#7

Thanks, I was in fact reading the same article.

So say I had a distribution over X = [X0, X1, X2, X3, X4] and my test input was [--, --, 0.5, 0.6, --]. To find P(X0|X2=0.5, X3=0.6), I would do the following:

  1. Marginalize over X’ = [X0, X2, X3] to get a distribution with parameters μ’ and Σ’.
  2. The parameters for the required conditional will then be found using the method as described in the linked Wikipedia article.
  3. Now I plug these parameters into pm.MvNormal and sample.

Have I understood this correctly?
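To make sure I follow, here is how I would sketch those three steps in plain NumPy/SciPy, with made-up values for μ and Σ (in the real model they would come from the posterior):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Made-up parameters for X = [X0, X1, X2, X3, X4]
mu = np.zeros(5)
A = rng.normal(size=(5, 5))
cov = A @ A.T + 5 * np.eye(5)   # guaranteed positive definite

# Step 1: marginalize to X' = [X0, X2, X3]
keep = [0, 2, 3]
mu_m = mu[keep]
cov_m = cov[np.ix_(keep, keep)]

# Step 2: condition on X2 = 0.5, X3 = 0.6 (indices 1, 2 within X')
obs_idx, rest_idx = [1, 2], [0]
p = np.array([0.5, 0.6])
s11 = cov_m[np.ix_(rest_idx, rest_idx)]
s12 = cov_m[np.ix_(rest_idx, obs_idx)]
s22 = cov_m[np.ix_(obs_idx, obs_idx)]
s22_inv = np.linalg.inv(s22)
mu_cond = mu_m[rest_idx] + s12 @ s22_inv @ (p - mu_m[obs_idx])
cov_cond = s11 - s12 @ s22_inv @ s12.T

# Step 3: sample X0 | X2 = 0.5, X3 = 0.6
samples = stats.multivariate_normal(mu_cond, cov_cond).rvs(1000, random_state=rng)
```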


#8

On a related note, finding the conditional covariance matrix requires inverting Σ_22. Can that be achieved directly with PyMC?

Some example code I’ve worked out, using matrix_inverse from theano.tensor.nlinalg since there is no plain inverse():

from theano.tensor.nlinalg import matrix_inverse

new_data = np.array([0.5, 0.0, 0.6, 0.0, 0.0])
mask = np.where(new_data != 0)[0]  # indices of the observed variables
infer = 1                          # index of the variable to infer

with model:
    s22_inv = matrix_inverse(sigma[mask, :][:, mask])
    # conditional mean: mu_1 + S_12 S_22^-1 (x_2 - mu_2)
    _mu = mu[[infer]] + sigma[[infer], :][:, mask].dot(s22_inv.dot(new_data[mask] - mu[mask]))
    # conditional covariance: S_11 - S_12 S_22^-1 S_21
    _sigma = sigma[[infer], :][:, [infer]] - sigma[[infer], :][:, mask].dot(s22_inv.dot(sigma[mask, :][:, [infer]]))

#9

So I noticed something strange while trying to write my model (which isn’t correct as yet and I would really appreciate your inputs), I’ve posted my notebook in this gist.

What’s happening is that when I run the line with model: trace2 = pm.sample(50, njobs=4), I get a shape mismatch error:

~/.local/lib/python3.5/site-packages/pymc3/sampling.py in init_nuts(init, chains, n_init, model, random_seed, progressbar, **kwargs)
   1426         var = np.ones_like(mean)
   1427         potential = quadpotential.QuadPotentialDiagAdapt(
-> 1428             model.ndim, mean, var, 10)
   1429     elif init == 'advi+adapt_diag_grad':
   1430         approx = pm.fit(

~/.local/lib/python3.5/site-packages/pymc3/step_methods/hmc/quadpotential.py in __init__(self, n, initial_mean, initial_diag, initial_weight, adaptation_window, dtype)
    119         if initial_diag is not None and len(initial_diag) != n:
    120             raise ValueError('Wrong shape for initial_diag: expected %s got %s'
--> 121                              % (n, len(initial_diag)))
    122         if len(initial_mean) != n:
    123             raise ValueError('Wrong shape for initial_mean: expected %s got %s'

ValueError: Wrong shape for initial_diag: expected 77 got 76

(The one extra dimension corresponds to the conditional)

To check what was being passed to model.dict_to_array(vals), I printed start.keys() before line 1425 in sampling.py.

Oddly enough the error went away and I was able to run the inference without any trouble, as you will see in the notebook.

cc @junpenglao @lucianopaz (as you’re both devs)


#10

I’m away from my PC, so I just looked through your gist but did not run it. You’re doing some weird indexing on your arrays: mask is an array of integers, but sometimes you seem to use it as a boolean mask, and other times you do something like sigma[mask, :], which applies integer indexing along the first axis.
Maybe you could take a quick glance at the numpy basic indexing documentation to make sure that what you are writing is doing what you want it to be doing.
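For example, with a toy 3x3 array:

```python
import numpy as np

sigma = np.arange(9).reshape(3, 3)
idx = np.array([0, 2])           # a list of integer indices, not a boolean mask

# Chained indexing: first pick rows, then columns -> the 2x2 submatrix
print(sigma[idx, :][:, idx])     # [[0 2], [6 8]]

# np.ix_ builds the same submatrix in one step
print(sigma[np.ix_(idx, idx)])   # [[0 2], [6 8]]

# Passing two integer arrays directly pairs them element-wise instead,
# giving a 1-d result (here, the diagonal entries):
print(sigma[idx, idx])           # [0 8]
```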


#11

To find the conditional and the marginal you need the full covariance matrix, which I am assuming you are trying to infer from the data. So here is what I would do:

  1. Build a model to find \mu and \Sigma.
  2. Sample from the posterior.
  3. Use the posterior samples to build the conditional posterior, and generate posterior predictive samples from it. You can sample from the joint conditional P(X0, X1, X4|X2=0.5, X3=0.6) and keep only X0.

I put up a small notebook as demonstration: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/PyMC3QnA/discourse_2265.ipynb
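Roughly, step 3 looks like this in NumPy/SciPy. The posterior draws below are made up to stand in for the real trace from pm.sample, and conditional_params is an illustrative helper, not PyMC API:

```python
import numpy as np
from scipy import stats

def conditional_params(mu, cov, obs_idx, obs_val):
    """Mean/cov of the remaining variables given X[obs_idx] = obs_val
    (standard MVN conditioning formulas)."""
    n = len(mu)
    rest = [i for i in range(n) if i not in obs_idx]
    s22_inv = np.linalg.inv(cov[np.ix_(obs_idx, obs_idx)])
    s12 = cov[np.ix_(rest, obs_idx)]
    mu_bar = mu[rest] + s12 @ s22_inv @ (obs_val - mu[obs_idx])
    cov_bar = cov[np.ix_(rest, rest)] - s12 @ s22_inv @ s12.T
    return mu_bar, cov_bar

# Hypothetical posterior draws of mu and cov; in practice these come
# from the trace returned by pm.sample.
rng = np.random.default_rng(1)
mu_draws = rng.normal(size=(100, 5))
cov_draws = np.broadcast_to(np.eye(5) * 2.0 + 0.5, (100, 5, 5))

obs_idx, obs_val = [2, 3], np.array([0.5, 0.6])
ppc = []
for m, c in zip(mu_draws, cov_draws):
    mu_bar, cov_bar = conditional_params(m, c, obs_idx, obs_val)
    # draw (X0, X1, X4) jointly, keep only X0 (index 0 of the rest)
    draw = stats.multivariate_normal(mu_bar, cov_bar).rvs(random_state=rng)
    ppc.append(draw[0])
ppc = np.asarray(ppc)
```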


#12

There is no boolean indexing here; mask is actually not a mask (my bad, I know), it’s just a list of the indices we want the marginal over. Although, I do agree that it’s a huge mess that’s difficult to understand.

Thanks for your help :slight_smile:


#13

This is great, thank you! I’ve had trouble wrapping my head around PyMC but I think I’m starting to get the hang of it.

I noticed that the second time, when sampling from the conditional, you didn’t use pm.sample. Why is that?


#14

pm.sample is for sampling from the posterior; here it is a posterior predictive sample, which is based on the sampled parameters output by pm.sample. Using scipy to sample is easier (even internally, sample_posterior_predictive also mostly relies on scipy).