I’m digging this up because it was referenced here: https://discourse.pymc.io/t/random-intercepts-and-slopes-correlated-convergence-issues/2865/15. I suspect that the issue has to do with the conflict between:

and
This sounded to me like a Simpson’s paradox, and having dug up the notebook, that seems to be the case. Plotting the posteriors of the “Random effect on the intercepts and slopes” model gives strongly anti-correlated within-subject estimates:
but looking across subjects tells a different story:
```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

tis = trace_intercept_slope

# Within-subject posteriors for the first three subjects
for j in range(3):
    sns.scatterplot(x=tis['gamma_Z_slope'][:, j],
                    y=tis['gamma_Z_intercept'][:, j],
                    alpha=0.2)

# All subjects together, one color per subject
plt.figure()
colors = ['red', 'black', 'orange', 'yellow', 'green', 'blue', 'purple', 'magenta']
all_vals = [[], []]
for j in range(tis['gamma_Z_slope'].shape[1]):
    sns.scatterplot(x=tis['gamma_Z_slope'][:150, j],
                    y=tis['gamma_Z_intercept'][:150, j],
                    color=colors[j % len(colors)],
                    alpha=0.2)
    all_vals[0].extend(tis['gamma_Z_slope'][:150, j])
    all_vals[1].extend(tis['gamma_Z_intercept'][:150, j])

# Pooled slope/intercept covariance across all subjects
np.cov(np.array(all_vals))
```
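The sign reversal can be reproduced on synthetic data. A minimal sketch (hypothetical numbers, not the model's actual posteriors): three "subjects" whose per-subject slope/intercept samples are anti-correlated, but whose subject-level means are positively aligned, so the pooled covariance flips positive:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical within-subject covariance: strongly anti-correlated
within = np.array([[1.0, -0.8],
                   [-0.8, 1.0]])

# Subject-level means that trend upward together (positively aligned)
means = np.array([[-5.0, -10.0],
                  [0.0, 0.0],
                  [5.0, 10.0]])

# 500 draws per "subject", stacked into one pooled sample
samples = np.vstack([
    rng.multivariate_normal(mean=m, cov=within, size=500)
    for m in means
])

# Pooled covariance: the off-diagonal is positive, even though every
# subject's own samples are anti-correlated — a Simpson's-paradox reversal
pooled = np.cov(samples.T)
print(pooled)
```

The between-subject spread of the means dominates the pooled off-diagonal term, which is exactly what makes the within-subject and across-subject pictures disagree.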
Or, using a bivariate normal approximation: the overall covariance between slope and intercept (computed from the pooled posterior samples in `all_vals`) is
$$\mathbf{\Sigma} = \left(\begin{array}{cc} 36.7 & 3.5 \\ 3.5 & 645.2\end{array}\right)$$
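For scale, that off-diagonal term implies only a mild positive overall correlation. A quick check, using just the numbers from the matrix above (slope first, intercept second):

```python
import numpy as np

# Pooled slope/intercept covariance matrix quoted above
Sigma = np.array([[36.7, 3.5],
                  [3.5, 645.2]])

# Implied overall slope-intercept correlation
r = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])
print(r)  # weakly positive
```

So the pooled correlation is weakly positive even though each subject's posterior is strongly anti-correlated, which is the Simpson's-paradox flavor of the situation.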
which lines up with