Diagnosing Cholesky Corr in Trace Plot

I am trying to extend the Hierarchical Rugby example to use a multivariate normal prior, following the steps from the Primer on Multilevel Modelling. The idea is that the attacking and defending attributes of a team are probably correlated. My model looks something like this, with some unrelated code omitted for readability:

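import pymc3 as pm
import theano.tensor as tt

# coords for the 'attrs' and 'team' dims are among the omitted code
with pm.Model() as model: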
    sd_dist = pm.Exponential.dist(0.5)
    chol, corr, stds = pm.LKJCholeskyCov("chol", n=2, eta=2.0, sd_dist=sd_dist, compute_corr=True)
    # Intercept for each attribute (i.e., Offense and Defense)
    attrs = pm.Normal('attrs', mu=0, sigma=1.0, shape=2)
    # Team specific variance from intercepts
    z = pm.Normal("z", 0.0, 1.0, dims=('attrs', 'team'))  # (2, num_teams)
    mv = pm.Deterministic("mv", tt.dot(chol, z).T, dims=('team', 'attrs'))  # (num_teams, attrs)

    # Expected values:
    # Team 1 Points
    team1_points_theta = tt.exp(attrs[0] + mv[team_1, 0] + mv[team_2, 1])
    team1_points = pm.Poisson('team1_points', mu=team1_points_theta, observed=team1_points_obs)
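
For anyone following along, the `tt.dot(chol, z).T` line is the non-centered construction: uncorrelated standard-normal offsets are multiplied by the Cholesky factor so that the two columns come out correlated. Here is a minimal NumPy sketch of just that step, outside PyMC. The standard deviations, the correlation of -0.6, and the number of teams are made-up illustrative values, not estimates from the rugby data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values only (assumptions, not taken from the model above)
stds = np.array([0.8, 0.5])           # attack / defense standard deviations
rho = -0.6                            # attack-defense correlation
corr = np.array([[1.0, rho], [rho, 1.0]])
cov = np.outer(stds, stds) * corr     # covariance matrix
chol = np.linalg.cholesky(cov)        # lower-triangular, cov == chol @ chol.T

num_teams = 50_000                    # large, so the sample correlation is stable
z = rng.standard_normal((2, num_teams))   # uncorrelated offsets, (2, num_teams)
mv = (chol @ z).T                          # correlated offsets, (num_teams, 2)

# Empirical correlation between the two columns of mv
sample_corr = np.corrcoef(mv, rowvar=False)
print(sample_corr.round(2))
```

The off-diagonal entry of `sample_corr` should land close to `rho`, which is the quantity `chol_corr` tracks in the posterior.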

The inference process seems to work; I don’t get any divergences, and the trace plot for most of the variables looks reasonable. However, the ‘chol_corr’ part of the traceplot looks completely uninformed:

However, if I just plot an individual component of the cholesky corr using:
sns.distplot(trace.posterior.chol_corr[0][:, 0, 1])
I get a seemingly reasonable looking distribution of values.

Am I doing something wrong? In the tutorial on multi-level modelling, the self-correlations (the vertical line at 1) don’t even appear in the traceplot. I would like to be sure I am visualizing the results correctly, so I can trust that the model is implemented correctly before I continue to improve it.

Looking forward to hearing y’alls thoughts!

The plot is not very informative there because it is compacted: all entries of the matrix are drawn in the same panel, and some of them are on a very different scale from the others.
You can try plotting that variable on its own with compact=False.
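
To see why the compact panel looks flat, it helps to compare the spread of each matrix entry. The sketch below uses synthetic draws as a stand-in for `trace.posterior.chol_corr` (the shapes and the distribution of the correlation are assumptions for illustration, not output from the model in the question). The diagonal is 1 in every draw, so its density is a spike that dwarfs the off-diagonal density when they share an axis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_chains, n_draws = 4, 1000

# Synthetic stand-in for the posterior of chol_corr (not real model output):
# only the off-diagonal entry varies from draw to draw.
rho = np.tanh(rng.normal(-0.5, 0.3, size=(n_chains, n_draws)))
chol_corr = np.ones((n_chains, n_draws, 2, 2))
chol_corr[..., 0, 1] = rho
chol_corr[..., 1, 0] = rho

# Standard deviation of each matrix entry across all chains and draws
spread = chol_corr.reshape(-1, 2, 2).std(axis=0)
print(spread)
```

The diagonal entries of `spread` are zero and the off-diagonal ones are not. With a real trace, something like `az.plot_trace(idata, var_names=["chol_corr"], compact=False)` then gives each entry its own panel, so the off-diagonal correlation is no longer squashed by the spike at 1.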

To build on Junpeng’s answer and make what he suggested even easier, you can use regexes with ArviZ functions:

    az.plot_forest(
        idata, var_names=r"_corr", filter_vars="regex", combined=True
    )

or plot_trace, but if you’re interested in the correlations’ values, plot_forest will be easier to read.
Hope this helps :vulcan_salute:
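
As a rough illustration of what the regex filter selects: my understanding is that `filter_vars="regex"` keeps every variable whose name matches the pattern anywhere in the name. The standard-library sketch below mimics that with `re.search`. It is not ArviZ's own code, and the list of names is an assumption about what the posterior of the model in the question contains.

```python
import re

# Assumed posterior variable names for the model in the question
posterior_vars = ["chol", "chol_corr", "chol_stds", "attrs", "z", "mv"]

pattern = r"_corr"

# Keep names where the pattern matches anywhere in the name
selected = [name for name in posterior_vars if re.search(pattern, name)]
print(selected)  # ['chol_corr']
```

So `r"_corr"` picks up `chol_corr` without having to spell out the full variable name, and it skips `chol` and `chol_stds`.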

Thank you for the quick replies! Setting compact=False worked for me.