Conditional multivariate sampling

Hello,

I fitted a multivariate normal distribution to a three-column matrix of random variables using the following code:

import pymc3 as pm

model = pm.Model()
shape = x.shape[1] # shape of x is [N, 3]

with model:
    packed_L = pm.LKJCholeskyCov('packed_L', n=shape,
                                 eta=2., sd_dist=pm.HalfCauchy.dist(2.5))
    L = pm.expand_packed_triangular(shape, packed_L)
    Σ = pm.Deterministic('Σ', L.dot(L.T))
    
    μ = pm.Normal('μ', 0., 10., shape=shape)
    
    obs = pm.MvNormal('obs', mu=μ, chol=L, observed=x)
    trace_ = pm.sample(5000, cores=1)

I would like to fix one variable of the distribution to a scalar value and obtain the resulting conditional distribution of the two remaining variables. How can that be achieved?

Thank you very much for your help.

Maxime.


Do you mean fixing μ[0] = scalar while keeping observed=x? Or would you like to marginalize the MvNormal so that observed=x[:, 1:]?

Thank you for your answer.

Let’s say that X1 is a random variable describing a person’s age, X2 is the annual revenue and X3 represents this person’s number of children.

The objective of the first code snippet was to model these variables with a multivariate normal distribution. Now, I would like to have the multivariate normal distribution of, say, X1 and X2 given that X3 = 3 children. Would that be possible?

x = [X1, X2, X3]

Thank you in advance,

Maxime.


Since [X1, X2, X3] are some observed quantities, I would either model x_tilt = [X1[X3==3], X2[X3==3]] and find the posterior of x_tilt; or model X = [X1, X2, X3] directly and find the posterior of X, then compute the conditional of this posterior when X3==3.

In the latter case you can use the code you have, then call sample_ppc after inference to generate a large number of posterior predictive samples and plot the conditional:

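# keep samples whose third component lies in a narrow window around 3
# (element-wise & is needed here; Python's `or` raises an error on arrays)
index = (ppc['obs'][:, 2] > 2.99) & (ppc['obs'][:, 2] < 3.01)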
scatter(ppc['obs'][index, 0], ppc['obs'][index, 1])
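
To make the windowing idea concrete, here is a minimal, self-contained sketch that uses NumPy only. The `samples` array is a synthetic stand-in for `ppc['obs']`, assumed to have shape (n_samples, 3) (reshape it first if yours has an extra dimension), and the values of `mu`, `cov`, `target` and `tol` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mean and covariance, standing in for a fitted model.
mu = np.array([40.0, 50.0, 2.0])
cov = np.array([[100.0, 30.0, 5.0],
                [30.0, 400.0, -4.0],
                [5.0, -4.0, 1.0]])

# Synthetic draws playing the role of posterior predictive samples.
samples = rng.multivariate_normal(mu, cov, size=200_000)

target, tol = 3.0, 0.01

# Boolean mask: rows whose third component is within tol of the target.
index = np.abs(samples[:, 2] - target) < tol

# Approximate draws of (X1, X2) given X3 close to the target.
cond = samples[index, :2]

print("kept", index.sum(), "of", len(samples), "samples")
print("approximate conditional mean:", cond.mean(axis=0))
```

A narrower window gives a better approximation of the exact conditional but keeps fewer samples, so the number of draws has to grow as `tol` shrinks.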

As X3 represents a continuous variable in my model, I will use the second method.

Thank you for your help.

Maxime.

FYI, with the second approach you can also compute the posterior conditional analytically from the posterior samples of mu and cov of the MvNormal in the trace: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions
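
A minimal sketch of that closed-form conditional, written with plain NumPy. The helper name `conditional_mvn` and the example numbers are hypothetical, not part of PyMC; the function takes one mean vector and one covariance matrix, so it would be applied to each posterior draw in turn.

```python
import numpy as np

def conditional_mvn(mu, cov, fixed_idx, fixed_value):
    """Mean and covariance of the free components of a multivariate
    normal, given that the components in fixed_idx equal fixed_value."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    fixed_idx = np.atleast_1d(fixed_idx)
    fixed_value = np.atleast_1d(np.asarray(fixed_value, dtype=float))
    free_idx = np.setdiff1d(np.arange(mu.size), fixed_idx)

    # Partition the covariance into free (1) and fixed (2) blocks.
    S11 = cov[np.ix_(free_idx, free_idx)]
    S12 = cov[np.ix_(free_idx, fixed_idx)]
    S22 = cov[np.ix_(fixed_idx, fixed_idx)]

    # W = S12 @ inv(S22), computed with a linear solve instead of an inverse.
    W = np.linalg.solve(S22, S12.T).T

    cond_mu = mu[free_idx] + W @ (fixed_value - mu[fixed_idx])
    cond_cov = S11 - W @ S12.T
    return cond_mu, cond_cov

# Hypothetical values for a single posterior draw of the mean and covariance.
mu = [40.0, 50.0, 2.0]
cov = [[100.0, 30.0, 5.0],
       [30.0, 400.0, -4.0],
       [5.0, -4.0, 1.0]]

# Distribution of (X1, X2) given X3 = 3.
cond_mu, cond_cov = conditional_mvn(mu, cov, fixed_idx=2, fixed_value=3.0)
print(cond_mu)
print(cond_cov)
```

With the model above, one would presumably loop over the draws, passing trace_['μ'][i] and trace_['Σ'][i] for each i, which yields a posterior distribution over the conditional mean and covariance rather than a single point estimate.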
