Hi everyone,
I looked at the source code and the existing examples for setting the sd_dist parameter in LKJCholeskyCov, and the intended usage seems ambiguous.
Here: https://docs.pymc.io/api/distributions/multivariate.html#pymc3.distributions.multivariate.LKJCholeskyCov
sd_dist = pm.HalfCauchy.dist(beta=2.5)
packed_chol = pm.LKJCholeskyCov('chol_cov', eta=2, n=10, sd_dist=sd_dist)
The shape of sd_dist is left unspecified, but n is equal to 10.
Here, however: https://docs.pymc.io/api/distributions/multivariate.html
sd_dist = pm.HalfCauchy.dist(beta=2.5, shape=3)
chol_packed = pm.LKJCholeskyCov('chol_packed', n=3, eta=2, sd_dist=sd_dist)
It’s slightly unclear to me why shape is 3. If this represents a more flexible model where each entry gets its own random variable, then surely the shape should be 3+2+1 = 6 (i.e., the number of entries in a 3x3 lower triangular matrix), not 3.
Actually, sd_dist is only applied to the diagonal: LKJ is a prior over correlation matrices, so the standard deviations (one per dimension) need their own prior. That’s why it has shape=3 in the second case. But you are right, the docs should be clearer - PR welcome.
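To make the counting concrete, here is a minimal sketch in plain Python (not PyMC3 code) of how n standard deviations and an n x n correlation matrix combine into a covariance matrix whose Cholesky factor has n(n+1)/2 packed entries. The helper names, the example values, and the row-by-row packing order are assumptions chosen for illustration only.

```python
import math

def covariance_from_sd_and_corr(sd, corr):
    """Return Sigma = diag(sd) @ corr @ diag(sd) as a nested list."""
    n = len(sd)
    return [[sd[i] * corr[i][j] * sd[j] for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular L with L @ L.T == a (a must be symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def pack_lower(L):
    """Flatten the lower triangle row by row (illustrative layout)."""
    return [L[i][j] for i in range(len(L)) for j in range(i + 1)]

# Hypothetical values: 3 standard deviations, 3 free correlations.
sd = [1.0, 2.0, 0.5]
corr = [[1.0, 0.3, -0.2],
        [0.3, 1.0, 0.5],
        [-0.2, 0.5, 1.0]]

sigma = covariance_from_sd_and_corr(sd, corr)
L = cholesky(sigma)
packed = pack_lower(L)

# Each row norm of L recovers the corresponding standard deviation.
row_norms = [math.sqrt(sum(x * x for x in row)) for row in L]

print(len(packed))   # number of packed Cholesky entries
print(row_norms)     # matches sd up to floating-point error
```

In this sketch the 6 packed entries come from 3 standard deviations plus 3 correlations, which is why a prior on the standard deviations alone only needs 3 components.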
I’d be glad to do a PR, but I guess I didn’t fully understand the LKJ formulation. In the example above, my impression was that sd_dist was the distribution over the individual entries of the Cholesky factor (which has 3+2+1 entries).
If sd_dist is only applied to the diagonal, what’s the prior over the off-diagonal elements? Is that what is determined by eta?
Finally - am I right in my intuition that sd_dist = pm.HalfCauchy.dist(beta=2.5, shape=3) corresponds to parametrizing each element of the diagonal with its own value, while sd_dist = pm.HalfCauchy.dist(beta=2.5) corresponds to tying together the elements of the diagonal to a shared value?
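On the eta question, a small sketch may help. As far as I know, the LKJ density over a correlation matrix R is proportional to det(R)^(eta - 1). For a 2x2 correlation matrix with off-diagonal entry r, det(R) = 1 - r^2, so the unnormalized log density can be written directly. The function name below is hypothetical and this is plain Python, not PyMC3's implementation.

```python
import math

def lkj_unnormalized_logp_2x2(r, eta):
    """Unnormalized LKJ log density for R = [[1, r], [r, 1]], with |r| < 1."""
    det = 1.0 - r * r              # determinant of the 2x2 correlation matrix
    return (eta - 1.0) * math.log(det)

# Compare how strongly different eta values penalize a large correlation.
for eta in (1.0, 2.0, 10.0):
    at_zero = lkj_unnormalized_logp_2x2(0.0, eta)
    at_high = lkj_unnormalized_logp_2x2(0.8, eta)
    print(eta, at_zero, at_high)
```

With eta = 1 the density is flat in r, and larger eta puts more mass near r = 0, so under this reading eta controls the prior on the correlations while sd_dist controls the scales.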