Cholesky decomposition and correlation among random effects



I’m digging this up because it was referenced here: https://discourse.pymc.io/t/random-intercepts-and-slopes-correlated-convergence-issues/2865/15. I suspect that the issue has to do with the conflict between:

and

This sounded to me like Simpson’s paradox, and having dug up the notebook, that seems to be the case. Plotting the posteriors of the model “Random effect on the intercepts and slopes” gives strongly anti-correlated within-subject estimates:

image

but looking across samples:

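import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

tis = trace_intercept_slope  # posterior trace from the notebook

# One figure per subject: slope vs. intercept samples for the first three subjects.
# x/y are passed as keywords because recent seaborn versions reject positional x, y.
for j in range(3):
    plt.figure()
    sns.scatterplot(x=tis['gamma_Z_slope'][:, j], y=tis['gamma_Z_intercept'][:, j], alpha=0.2)

# Overlay the first 150 samples of every subject in one figure, one color per subject,
# and pool them into all_vals = [slopes, intercepts].
plt.figure()
colors = ['red', 'black', 'orange', 'yellow', 'green', 'blue', 'purple', 'magenta']
all_vals = [[], []]
for j in range(tis['gamma_Z_slope'].shape[1]):
    sns.scatterplot(x=tis['gamma_Z_slope'][:150, j],
                    y=tis['gamma_Z_intercept'][:150, j],
                    color=colors[j % len(colors)],
                    alpha=0.2)
    all_vals[0].extend(tis['gamma_Z_slope'][:150, j])
    all_vals[1].extend(tis['gamma_Z_intercept'][:150, j])

# 2x2 covariance of the pooled samples (row 0: slope, row 1: intercept)
np.cov(np.array(all_vals))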

image

Or using a bivariate normal approximation:

image
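To illustrate how strongly anti-correlated within-subject samples can coexist with an almost uncorrelated pooled cloud, here is a minimal synthetic sketch. Everything in it is an assumption for illustration: the subject means, the within-subject covariance, and the variable names are made up and are not taken from the notebook. The "bivariate normal approximation" here is simply the sample mean and covariance of the pooled draws.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_samples = 8, 150

# Hypothetical subject-level means (synthetic). The intercept pattern is symmetric
# and the slope pattern antisymmetric, so the means themselves are uncorrelated.
slope_means = np.linspace(-8.0, 8.0, n_subjects)
intercept_means = np.array([30.0, -10.0, -20.0, 0.0, 0.0, -20.0, -10.0, 30.0])

# Within each subject, slope and intercept draws are strongly anti-correlated.
within_cov = np.array([[1.0, -0.9],
                       [-0.9, 1.0]])

# samples has shape (n_subjects, n_samples, 2); last axis is (slope, intercept).
samples = np.stack([
    rng.multivariate_normal([slope_means[j], intercept_means[j]],
                            within_cov, size=n_samples)
    for j in range(n_subjects)
])

# Correlation computed separately inside each subject.
within_corr = np.array([np.corrcoef(samples[j].T)[0, 1]
                        for j in range(n_subjects)])

# Pool all subjects and summarise with a single bivariate normal (mean, covariance).
pooled = samples.reshape(-1, 2)
mu = pooled.mean(axis=0)
sigma = np.cov(pooled.T)
pooled_corr = sigma[0, 1] / np.sqrt(sigma[0, 0] * sigma[1, 1])

print("within-subject correlations:", np.round(within_corr, 2))
print("pooled covariance:\n", np.round(sigma, 1))
print("pooled correlation:", round(float(pooled_corr), 3))
```

In this toy setup the pooled covariance is dominated by the spread of the subject means, so the within-subject anti-correlation is almost invisible once the samples are pooled.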

The overall covariance (using the posterior samples all_vals) between slope and intercept is

$$\mathbf{\Sigma} = \left(\begin{array}{cc} 36.7 & 3.5 \\ 3.5 & 645.2\end{array}\right)$$

which lines up with