I have a section of my code where I estimate a covariance matrix using the LKJCholeskyCov. However, this is making me really confused as to what randomly distributed variables in my model get a proper “name” and get explicitly sampled in my posterior chain.
Let me show what I mean. Here is an example of a section of my code:
# Priors for error covariance matrix
sd_dist = pm.HalfNormal.dist(sd=1, shape=num_dep_vars)
packed_chol = pm.LKJCholeskyCov('chol_cov', eta=2, n=num_dep_vars, sd_dist=sd_dist)
chol = pm.expand_packed_triangular(num_dep_vars, packed_chol, lower=True)
# Generate covariance matrix from Cholesky factor
cov_mtx = tt.dot(chol, chol.T)
cov_terms = pm.Deterministic('cov_terms',cov_mtx[cov_term_indices])
# Generate the standard deviations from the cholesky factor
sd = pm.Deterministic('sd',tt.sqrt(tt.diag(cov_mtx)))
# Tensor manipulation used to extract the correlation terms
cor_mtx = tt.diag(sd**-1).dot(cov_mtx).dot(tt.diag(sd**-1))
cor_terms = pm.Deterministic('cor_terms',cor_mtx[cor_term_indices])
So I first create a half-normally distributed set of standard deviations, build up a vector of cholesky factors, build up a covariance matrix. Then, i start “going backwards”, extracting the variances, standard deviations and finally, the actual correlations.
But why do I have to do this to get to the standard deviations? Aren’t I already calculating them at the top of the code, when I say sd_dist = pm.HalfNormal.dist(sd=1, shape=num_dep_vars)? Why isn’t sd_dist being “watched” as a variable to be sampled from in the posterior, as is the packed_chol variable? Is it because sd_dist isn’t named? Also, why can I not give sd_dist a name?
I’m just getting a bit confused here with what gets a proper name and what gets sampled from in my posterior chains.
LKJCholeskyCov is defined based on the LKJ Correlation matrix, which does not contain information of the variance (diagonal being 1). Here we give it a distribution so that the diagonal elements could be evaluated on.
I think you can do below to keep track the sd, but not sure - please verify yourself.
My question is: why does sd_dist look like a distribution while sd and chol_cov look like actual parameters? I’ve defined all three of these things so similarly in my model. Why are they behaving so differently?
Hmm I was wrong,sd_dist is just a free node on the graph and the sample you see is from the prior. LKJCholeskyCov is evaluating the logp using the log-likelihood but the RV sd_dist is not actually connect to any other variables in the model.
In that case, there probably no easier way to log the sd_dist into the trace than what you are already doing with the deterministic.
Thanks for the clarification, @junpenglao! But my question still remains: if I defined sd_dist just like the other variables, why is PYMC drawing from it’s prior instead of its posterior?
I think i’m missing something super basic and silly here.
Because the sd_dist is never actually used in the model, you can think of it as a potential function, which is evaluated and added to the model logp, but will not leave samples in the MCMC chain (in another word, it is not a free parameter that other parameters depends on)