I am trying to extend the Hierarchical Rugby example to use a multivariate prior, following the steps from the Primer on Multilevel Modelling. The idea is that each team's attacking and defending attributes are probably correlated. My model looks something like this, with some unrelated code omitted for readability:
with pm.Model() as model:
    sd_dist = pm.Exponential.dist(0.5)
    chol, corr, stds = pm.LKJCholeskyCov("chol", n=2, eta=2.0, sd_dist=sd_dist, compute_corr=True)

    # Intercept for each attribute (i.e., Offense and Defense)
    attrs = pm.Normal('attrs', mu=0, sigma=1.0, shape=2)

    # Team specific variance from intercepts
    z = pm.Normal("z", 0.0, 1.0, dims=('attrs', 'team'))  # (2, num_teams)
    mv = pm.Deterministic("mv", tt.dot(chol, z).T, dims=('team', 'attrs'))  # (num_teams, attrs)

    # Expected values:
    # Team 1 Points
    team1_points_theta = tt.exp(attrs + mv[team_1, 0] + mv[team_2, 1])
    team1_points = pm.Poisson('team1_points', mu=team1_points_theta, observed=team1_points_obs)
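To make the shapes in the `mv` line concrete, here is a minimal sketch in plain Python (no PyMC or Theano) of the same non-centered transform: a lower-triangular factor multiplied by standard-normal draws, then transposed. The factor values, `num_teams = 4`, and the helper names `matmul` and `transpose` are assumptions made up for illustration; in the real model `chol` is sampled, not fixed.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


random.seed(0)
num_teams = 4  # assumed number of teams, for illustration only

# Hypothetical lower-triangular Cholesky factor of a 2x2 covariance matrix.
chol = [[1.0, 0.0],
        [0.6, 0.8]]

# Standard-normal offsets, shape (2, num_teams), like `z` in the model.
z = [[random.gauss(0.0, 1.0) for _ in range(num_teams)] for _ in range(2)]

# Same operation as tt.dot(chol, z).T, giving shape (num_teams, 2).
mv = transpose(matmul(chol, z))

for team, (attack, defend) in enumerate(mv):
    print(f"team {team}: attack offset {attack:+.3f}, defence offset {defend:+.3f}")
```

Column 0 of `mv` depends only on the first row of `z`, while column 1 mixes both rows, which is where the correlation between the two attributes comes from.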
The inference process seems to work; I don’t get any divergences, and the trace plot for most of the variables looks reasonable. However, the ‘chol_corr’ part of the traceplot looks completely uninformed:
However, if I plot just one individual component of the Cholesky correlation matrix using:
sns.distplot(trace.posterior.chol_corr[:, :, 0, 1].values.ravel())  # off-diagonal entry, all chains and draws
I get a reasonable-looking distribution of values.
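As a sanity check on what each entry of `chol_corr` should look like, here is a minimal sketch in plain Python (not PyMC) that converts a Cholesky factor into a correlation matrix. The helper name `corr_from_chol` and the uniform "draws" are assumptions for illustration only, not real posterior samples. It shows that the diagonal entries are 1 for every draw by construction, so only the off-diagonal entry carries information.

```python
import math
import random


def corr_from_chol(chol):
    """Turn a lower-triangular Cholesky factor L into the correlation matrix of L @ L.T."""
    n = len(chol)
    cov = [
        [sum(chol[i][k] * chol[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    stds = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (stds[i] * stds[j]) for j in range(n)] for i in range(n)]


random.seed(1)
draws = []
for _ in range(1000):
    # Made-up stand-in for one posterior draw of the 2x2 factor.
    a = random.uniform(0.5, 2.0)
    b = random.uniform(-1.0, 1.0)
    c = random.uniform(0.5, 2.0)
    draws.append(corr_from_chol([[a, 0.0], [b, c]]))

diag = [d[0][0] for d in draws] + [d[1][1] for d in draws]
off = [d[0][1] for d in draws]

print("diagonal range:    ", min(diag), max(diag))
print("off-diagonal range:", min(off), max(off))
```

If that is what is happening here, the flat lines at 1 in the trace plot would just be the diagonal. Restricting the plot to the off-diagonal entry, for example with `az.plot_trace(trace, var_names=["chol_corr"], coords={"chol_corr_dim_0": 0, "chol_corr_dim_1": 1})`, should hide them, assuming ArviZ gave the dimensions those default names.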
Am I doing something wrong? In the tutorial on multilevel modelling, the self-correlations (the vertical line at 1) don’t even appear in the trace plot. I would like to be sure I am visualizing the trace correctly, so I can be confident the model is implemented correctly before I continue to improve it.
Looking forward to hearing y’alls thoughts!