In Gelman & Hill, chapter 13 introduces the idea of allowing more than one regression coefficient to vary by group. They allow these coefficients to be correlated. In section 13.3, they introduce the scaled inverse-Wishart to model the covariance matrix of the coefficients. In section 13.4, under the heading ‘Understanding correlations between group-level intercepts and slopes,’ they provide an example of correlated slope and intercept coefficients that they almost completely resolve just by centering the data.
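To convince myself of the centering point, here is a toy simulation. It is only a sketch on made-up data, not the example from the book: the group count, the predictor range, and all the standard deviations are my own assumptions. Each group's level (at a typical value of x) and slope are drawn independently, and a separate least-squares line is fit per group, once on the raw predictor and once on the centered one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_obs = 200, 30

# Simulated (hypothetical) data: the predictor lives far from zero.
x = rng.uniform(5.0, 10.0, size=(n_groups, n_obs))
x_mean = x.mean()

# True group lines: the level at the typical x and the slope are independent.
level = rng.normal(0.0, 1.0, size=n_groups)
slope = rng.normal(1.0, 0.5, size=n_groups)
noise = rng.normal(0.0, 1.0, size=x.shape)
y = level[:, None] + slope[:, None] * (x - x_mean) + noise


def fit_by_group(x, y):
    """Per-group least-squares fits; returns (intercepts, slopes)."""
    coefs = np.array([np.polyfit(xg, yg, 1) for xg, yg in zip(x, y)])
    return coefs[:, 1], coefs[:, 0]  # polyfit returns [slope, intercept]


a_raw, b_raw = fit_by_group(x, y)           # intercept is the value at x = 0
a_ctr, b_ctr = fit_by_group(x - x_mean, y)  # intercept is the value at mean x

corr_raw = np.corrcoef(a_raw, b_raw)[0, 1]
corr_centered = np.corrcoef(a_ctr, b_ctr)[0, 1]
print(f"raw x:      corr(intercept, slope) = {corr_raw:+.2f}")
print(f"centered x: corr(intercept, slope) = {corr_centered:+.2f}")
```

With the raw predictor, the intercept is an extrapolation back to x = 0, so a steeper slope mechanically implies a lower intercept and the two come out strongly negatively correlated. After centering, the intercept is the level at the average x, and most of that induced correlation goes away.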

With that context, I realized that I’ve seen many models that don’t bother modeling the covariance between group-level coefficients:

- The Varying Intercept/Varying Slope version of the Radon model from the pymc3 docs
- The Hierarchical Model of the Premier League from my blog
- The Hierarchical Model of Six Nations Rugby, from the pymc3 docs

I’m trying to wrap my mind around the implications of *not* modeling the covariance.

If I understand correctly, by not modeling the covariance, we’re in effect placing an infinitely strong prior on the coefficients being independent. But in the soccer and rugby examples, I’d actually expect the attacking and defending strengths to be strongly correlated, so I’m not encoding all of my prior knowledge in the model. This could come back to bite me. Say it is early in the season and, based on results so far, one team appears to be very strong in attack but very weak in defense. The model wouldn’t be able to rely on a) my prior knowledge that this is unlikely or b) any evidence provided by the other teams in the league that attacking and defending strength usually go together.
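To make the early-season scenario concrete, here is a small sketch using a conjugate normal setup rather than the actual soccer or rugby models. Everything in it is an assumption for illustration: both strengths are on a scale where higher is better, the prior is a zero-mean bivariate normal with known unit standard deviations, the correlation `rho` is fixed rather than learned from the other teams, and the noisy early-season estimate `y` has known standard deviation `obs_sd`. In that setting the posterior mean has the closed form `prior_cov @ inv(prior_cov + noise_cov) @ y`.

```python
import numpy as np


def make_prior_cov(rho, sd=1.0):
    """2x2 prior covariance for (attack, defense) with correlation rho."""
    return sd**2 * np.array([[1.0, rho], [rho, 1.0]])


def posterior_mean(y, prior_cov, obs_sd):
    """Posterior mean of theta when theta ~ N(0, prior_cov) and
    y ~ N(theta, obs_sd^2 * I)."""
    noise_cov = obs_sd**2 * np.eye(len(y))
    return prior_cov @ np.linalg.solve(prior_cov + noise_cov, y)


obs_sd = 1.5  # assumed: early-season estimates are noisy

# Hypothetical team: looks strong in attack, weak in defense so far.
y_odd = np.array([2.0, -2.0])
# Hypothetical team: looks strong in both.
y_typical = np.array([2.0, 2.0])

for label, y in [("strong attack / weak defense", y_odd),
                 ("strong attack / strong defense", y_typical)]:
    independent = posterior_mean(y, make_prior_cov(0.0), obs_sd)
    correlated = posterior_mean(y, make_prior_cov(0.8), obs_sd)
    print(label)
    print("  independent prior:", np.round(independent, 3))
    print("  correlated prior: ", np.round(correlated, 3))
```

Under the independent prior, each coefficient is shrunk toward zero by the same factor no matter what the other one looks like. Under the correlated prior, the lopsided team is pulled toward zero much harder, because that pattern is unlikely a priori, while the team that is strong in both is shrunk less. If this toy version carries over, the independent model would be overconfident about the lopsided team early in the season and the difference would fade as data accumulates.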

Am I thinking about this correctly? To make it more concrete: what is the worst thing that could happen from _not_ modeling the covariance of these coefficients?