Marginal distribution estimation

For example, say I have a 100-dimensional normal distribution with a known covariance matrix.
But the location parameter (μ) of each dimension is given by an independent Beta distribution.
How can I estimate the marginal distribution of each dimension of the resulting distribution?

I am not sure I understand what you are asking for, so sorry if I only confuse you.

Analytically, you get the marginal of the i-th dimension by looking at a univariate Gaussian with mean mu[i] and variance cov[i, i], where mu and cov are the posterior mean and covariance parameters of your 100-dimensional Gaussian. If you have samples from the posterior, you should be able to simply select trace["my_gaussian"][:, i], where my_gaussian is the name you gave to the 100-dimensional Gaussian in your PyMC3 model.
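To illustrate the point above with a small self-contained sketch (plain NumPy, a 3-dimensional toy case standing in for the 100-dimensional one, made-up mu and cov values): the i-th marginal of a multivariate normal is itself normal with mean mu[i] and variance cov[i, i], which you can check by slicing the samples the same way you would slice a trace.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-dimensional example (stand-in for the 100-dim case)
mu = np.array([1.0, -2.0, 0.5])
cov = np.array([[2.0, 0.3, 0.1],
                [0.3, 1.0, 0.2],
                [0.1, 0.2, 0.5]])

# Joint samples, analogous to trace["my_gaussian"]
samples = rng.multivariate_normal(mu, cov, size=200_000)

# Marginal of dimension i: just the i-th column of the samples,
# which should look like N(mu[i], cov[i, i])
i = 1
marginal = samples[:, i]
print(marginal.mean())  # close to mu[1] = -2.0
print(marginal.var())   # close to cov[1, 1] = 1.0
```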

If this does not answer your question, maybe it would be helpful to have a snippet of your model to understand exactly what you are looking for.

Thank you!
I am trying something like this (obviously, it is just pseudocode):

```python
with pm.Model() as model1:
    theta1 = pm.Beta('theta1', 1, 2)
    theta2 = pm.Beta('theta2', 1, 2)
    theta3 = pm.Beta('theta3', 1, 2)
    theta4 = pm.Beta('theta4', 1, 2)
    ...
    theta100 = pm.Beta('theta100', 1, 2)

    mnorm = pm.MvNormal('mnorm', mu=[theta1, theta2, ..., theta100], cov=cov)

mnorm.sample(10000)
```

Where the parameters for each Beta are obtained from experiments.
I'm not sure an easy analytical solution exists for this.

It is still a bit unclear to me what you are trying to achieve.

Where does the data come into your model exactly? At the mnorm or at each of the thetas? What kind of data do you have? Unless you are just trying to sample from an already known model (in which case you don't really need PyMC), you should have an observed argument somewhere in your model, where you input your data, in order to infer the posterior over the remaining parameters (those without the observed argument).

To run inference, sampling is done via trace = pm.sample() (still inside the with pm.Model() as model1: block), not via mnorm.sample(10000) as in your pseudocode.


For simplicity, you can use betas = pm.Beta("betas", 1, 2, size=100) instead of defining 100 separate variables.

With that, as @ricardoV94 mentioned, you can define the observation distribution as mnorm = pm.MvNormal('mnorm', mu=betas, cov=cov, observed=obs), given an array of observations obs. From there you can call pm.sample().
