I’ve never really understood what eta does (I think I observed that my results were independent of the value I set there). So I took the opportunity to play around with this:
- I can reproduce this with the code provided by @monteiro.
- I also fiddled with the prior (`pm.Bound(pm.HalfFlat, lower=0.00001)`, then with an upper bound > 3000), but eta still ends up at the lower boundary. Bounding eta to > 1 still makes it converge to minimal values just above one (I thought $\eta \approx \epsilon$ might be a local minimum when cutting off infinity for numeric reasons).
- When using an old example by Austin Rochford, which attaches a multivariate block of observations to the prior instead of just fitting the correlation matrix, eta (he called it `nu`) also takes very small values in the posterior: test_eta2.py (1.4 KB). When eta is left free, sampling is also not exactly smooth there: it slows down drastically at times, and chains fail late on. EDIT: the file contained a mistake, new try: test_eta2.py (1.5 KB). Still a small eta.
Note that this last case could also be done with `LKJCholeskyCov`, which also has an eta parameter. EDIT: with this prior, sampling is much faster: test_eta3.py (1.8 KB). However, eta still remains close to zero.
The documentation in `pymc3/distributions/multivariate` says:
For eta → oo the LKJ prior approaches the identity matrix.
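To make that statement concrete, here is a minimal sketch in plain Python (not pymc3 code). It assumes the usual form of the LKJ density, proportional to det(R)^(eta - 1), and restricts itself to a 2x2 correlation matrix with off-diagonal entry r, where det(R) = 1 - r^2. The function name `lkj_logp_unnormalized` is made up for illustration, and the eta-dependent normalizing constant is deliberately left out.

```python
import math


def lkj_logp_unnormalized(r, eta):
    """Unnormalized LKJ log-density of the 2x2 correlation matrix [[1, r], [r, 1]].

    Assumes p(R | eta) is proportional to det(R) ** (eta - 1); the normalizing
    constant (which also depends on eta) is omitted here.
    """
    det = 1.0 - r * r  # determinant of the 2x2 correlation matrix
    return (eta - 1.0) * math.log(det)


# Compare a weakly and a strongly correlated matrix for a few eta values.
for eta in (0.5, 1.0, 10.0):
    weak = lkj_logp_unnormalized(0.1, eta)
    strong = lkj_logp_unnormalized(0.9, eta)
    print(f"eta={eta}: logp(r=0.1)={weak:+.3f}, logp(r=0.9)={strong:+.3f}")
```

Under this assumption, eta > 1 penalizes strong correlations (mass moves toward the identity matrix), eta = 1 is flat over r, and eta < 1 favors strong correlations. Note that this only shows the effect of eta for a fixed matrix; once eta itself is sampled, the omitted normalizing constant matters as well.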
So in all of these setups, the posterior of eta concentrates on very small values. How come?
My understanding of the pymc3 code is not advanced enough to tell you what’s going on behind the scenes.
I guess this is something for the developers! Thanks for clearing it up.