How to use the multivariate normal distribution (x, y, z) in pymc3 and calculate the correlation between the variables in f(x, y, z)?
My problem has three variables, each with a uniform prior, and I have a function f(x, y, z). Because I don't know how to use the multivariate normal distribution, my code does not do what I need. (In the code below I temporarily use pm.Normal to find the posterior distribution, but that is not correct.) I am using Normal priors for the 3 variables so that the posterior distributions are bell-shaped.
basic_model = pm.Model()
with basic_model:
    nodes1 = pm.Normal('nodes1', 0.05, 0.06)
    nodes2 = pm.Normal('nodes2', 1.0, 0.1)
    nodes3 = pm.Normal('nodes3', 50, 5)
    mu = f(nodes1, nodes2, nodes3)  # my function f of the three variables
    pm.Normal('observed', mu, 0.05, observed=evals)
    trace = pm.sample(1000, cores=1)
k = pm.summary(trace).round(2)
pm.plot_posterior(trace, varnames=['nodes1', 'nodes2', 'nodes3'])
tracedf1 = pm.trace_to_dataframe(trace, varnames=['nodes1', 'nodes2', 'nodes3'])
To answer your first question, how to use the MvNormal distribution: mathematically, the multivariate normal's parameters are its mean (which is a vector) and its covariance matrix. Following this, to create an MvNormal you have to pass a vector as the mean (its entries can be other RVs) and some representation of the covariance matrix. Because the covariance matrix is positive semidefinite and symmetric, it is a bit redundant to represent it with the full NxN matrix of numbers; one could instead represent it with its Cholesky decomposition, or even with its inverse (which is sometimes called the precision). Thus there are three ways pymc3 allows you to represent the covariance matrix:
- The full covariance matrix (you supply this parametrization with the kwarg cov)
- The precision, i.e. the inverse of the covariance matrix (you supply this parametrization with the kwarg tau)
- The Cholesky decomposition of the covariance matrix (you supply this parametrization with the kwarg chol)
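These three parametrizations all describe the same matrix. As a quick numpy-only sanity check (a sketch, not pymc3 code), here is a valid covariance matrix together with its precision and Cholesky representations, showing that each can be converted back to the covariance:

```python
import numpy as np

# A valid covariance matrix: symmetric and positive definite.
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.25]])

# The two alternative representations:
tau = np.linalg.inv(cov)        # precision matrix (kwarg tau)
chol = np.linalg.cholesky(cov)  # lower-triangular Cholesky factor (kwarg chol)

# Either one recovers the covariance matrix exactly.
assert np.allclose(np.linalg.inv(tau), cov)
assert np.allclose(chol @ chol.T, cov)
```

In a model you would pass exactly one of these, e.g. `pm.MvNormal('x', mu=mu, chol=chol, shape=3)`.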
Before going on with the multivariate normal, I'll take a short detour into LKJ. It's easy to write vector random variables for the mean; what is harder is to write a distribution that represents covariance matrices. The LKJCorr distribution does almost that: it is a distribution over correlation matrices (whose diagonal elements are all equal to one). To get a covariance matrix you just need to sandwich the correlation matrix between diagonal matrices built from the sigma values (which can be built from a vector RV representing sigma), so that cov[i, j] = sigma[i] * sigma[j] * corr[i, j] and the diagonal holds the variances sigma**2. One can also use the LKJCholeskyCov distribution, which works like LKJCorr but gives control over the sigma distribution and also returns the Cholesky decomposition of the covariance matrix directly.
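The correlation-to-covariance step can be checked with plain numpy (a sketch with made-up numbers, standing in for draws from LKJCorr and a sigma RV):

```python
import numpy as np

# A correlation matrix: ones on the diagonal, like an LKJCorr draw.
corr = np.array([[ 1.0, 0.3, -0.2],
                 [ 0.3, 1.0,  0.5],
                 [-0.2, 0.5,  1.0]])
sigma = np.array([0.06, 0.1, 0.5])  # per-variable standard deviations

# Sandwich the correlation matrix between diag(sigma) on both sides:
# cov[i, j] = sigma[i] * sigma[j] * corr[i, j]
cov = np.diag(sigma) @ corr @ np.diag(sigma)

# The diagonal of the result holds the variances sigma**2 ...
assert np.allclose(np.diag(cov), sigma**2)
# ... and normalizing the covariance recovers the correlation matrix.
assert np.allclose(cov / np.outer(sigma, sigma), corr)
```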
With LKJCholeskyCov in hand, you could specify your prior over the nodes like this:
mu0 = pm.Normal('mu0', [0.05, 1, 50], [0.06, 0.1, 0.5], shape=3)
sd_dist = pm.HalfCauchy.dist(beta=2.5, shape=3)
chol_packed = pm.LKJCholeskyCov('chol_packed', n=3, eta=2, sd_dist=sd_dist)
chol = pm.expand_packed_triangular(3, chol_packed)
nodes = pm.MvNormal('nodes', mu=mu0, chol=chol, shape=3)
You can keep the rest of your model's parametrization, but now the prior over the nodes has correlations that can be inferred.
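As for your second question, computing the correlation between the variables: after sampling, `trace['nodes']` is an array of shape (n_samples, 3), and the posterior correlations can be read off with `np.corrcoef`. A minimal numpy sketch, using simulated draws in place of the real trace:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for trace['nodes']: correlated draws of shape (n_samples, 3).
true_cov = np.array([[0.04, 0.01, 0.00],
                     [0.01, 0.09, 0.02],
                     [0.00, 0.02, 0.25]])
samples = rng.multivariate_normal([0.05, 1.0, 50.0], true_cov, size=5000)

# np.corrcoef expects variables in rows, so transpose the (samples, vars) array.
corr = np.corrcoef(samples.T)

assert corr.shape == (3, 3)
assert np.allclose(np.diag(corr), 1.0)  # each variable correlates 1 with itself
assert np.allclose(corr, corr.T)        # correlation matrices are symmetric
```

With the model above you would replace `samples` with `trace['nodes']`; the off-diagonal entries of `corr` are then posterior estimates of the pairwise correlations.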
Thank you for replying to me, but this line does not seem to work:
mu0 = pm.Normal('mu0', [0.05, 1, 50], [0.06, 0.1, 0.5], shape=3)
When I try this command I get the error "unsupported operand type(s) for ** or pow(): 'list' and 'float'". Do you know how to fix it?
Ah, sorry. You have to pass in numpy.arrays instead of plain lists. Try this:
mu0 = pm.Normal('mu0', np.array([0.05, 1, 50]), np.array([0.06, 0.1, 0.5]), shape=3)
Thank you for your help. Can you explain one thing for me: when I run this code, what are nodes__0, nodes__1, and nodes__2? Are they the likelihood data of my problem? If not, what is the likelihood in this code?