Hi Oriol, thanks for the detailed reply!
You raise lots of good points, and I’m not disagreeing with you - I’ve no reason to suspect the matrix indexing is wrong per se. Will try out the subsampling bootstrap and report back.
I think I’ll have to create a synthetic dataset biased in ways I know precisely - and see what happens to that.
As to your conceptual questions, this could be my misunderstanding, but I’m slightly uneasy to see that the posterior lkjcc_corr seems to disagree (in sign) with the empirical correlation of the posteriors on b0.
Hopefully to draw analogy: in the simplest case of:
m1 = pm.Normal('m1', mu=0, sigma=1)
m2 = pm.Normal('m2', mu=m1, sigma=1, shape=3)
… I would expect the posterior mean of hyperparam m1 to represent the weighted mean of the posterior 3 levels in m2
I think this is analogous to:
chol, corr_, stds_ = pm.LKJCholeskyCov('lkjcc', n=3, eta=2.0,
sd_dist=sd_dist, compute_corr=True)
b0 = pm.MvNormal('b0', mu=0, chol=chol, dims='b0_names')
… since even though we’re not hierarchical on the mu of b0, the posterior lkjcc_corr is (indirectly via chol) fed into b0. If lkjcc_corr is not fixed to e.g. 0, but is free to move, is it really plausible that the lowest energy state for those posteriors would be a different sign to those of the MvN posteriors that they directly feed?
Perhaps a quick test will be to feed lkjcc_corr into b0. I’ll try that later on today