Diagnosing Cholesky Corr in Trace Plot

bglick13 · October 2, 2020, 1:31pm

I am trying to extend the Hierarchical Rugby example to use a multivariate prior on the team effects, following the steps from the Primer on Multilevel Modeling. The idea is that the attacking and defending attributes of a team are probably correlated. My model looks something like this, with some unrelated code omitted for readability:

import pymc3 as pm
import theano.tensor as tt

# The 'attrs' and 'team' coords, team_1, team_2 and team1_points_obs
# are defined in the omitted code.
with pm.Model() as model:
    sd_dist = pm.Exponential.dist(0.5)
    chol, corr, stds = pm.LKJCholeskyCov("chol", n=2, eta=2.0, sd_dist=sd_dist, compute_corr=True)
    # Intercept for each attribute (i.e., Offense and Defense)
    attrs = pm.Normal('attrs', mu=0, sigma=1.0, shape=2)

    # Team-specific offsets from the intercepts (non-centered parametrization)
    z = pm.Normal("z", 0.0, 1.0, dims=('attrs', 'team'))  # (2, num_teams)
    mv = pm.Deterministic("mv", tt.dot(chol, z).T, dims=('team', 'attrs'))  # (num_teams, attrs)

    # Expected values:
    # Team 1 Points
    team1_points_theta = tt.exp(attrs[0] + mv[team_1, 0] + mv[team_2, 1])
    team1_points = pm.Poisson('team1_points', mu=team1_points_theta, observed=team1_points_obs)

The inference process seems to work: I don’t get any divergences, and the trace plot for most of the variables looks reasonable. However, the ‘chol_corr’ panel of the trace plot looks completely uninformative.

If I plot a single off-diagonal component of the Cholesky correlation matrix on its own, though, using:
sns.distplot(trace.posterior.chol_corr[0][:, 0, 1])
I get a reasonable-looking distribution of values.

Am I doing something wrong? In the tutorial on multilevel modelling, the self-correlations (the vertical line at 1) don’t even appear in the trace plot. I would like to be sure I am visualizing the posterior correctly, so that I know the model is implemented correctly and working before I continue to improve it.

Looking forward to hearing y’alls thoughts!

junpenglao · October 3, 2020, 7:50am

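The plot is not very informative there because it is compact: all entries of the matrix are drawn in the same panel, and some of them sit in a very different range from the others (the diagonal of a correlation matrix is fixed at 1).
You can also try plotting with compact=False for that variable alone.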
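A minimal sketch of that suggestion, under some assumptions: the data below is a synthetic stand-in for the real trace, the dimension names "row" and "col" are hypothetical, and the calls assume an ArviZ 0.x release with the matplotlib backend.

```python
import arviz as az
import numpy as np

# Synthetic stand-in for the real trace: 2 chains x 500 draws of a 2x2
# correlation matrix. The diagonal is exactly 1; only the off-diagonal varies.
rng = np.random.default_rng(0)
rho = np.tanh(rng.normal(0.5, 0.3, size=(2, 500)))  # values in (-1, 1)
corr = np.ones((2, 500, 2, 2))
corr[..., 0, 1] = rho
corr[..., 1, 0] = rho

# "row" and "col" are hypothetical dimension names chosen for this sketch.
idata = az.from_dict(
    posterior={"chol_corr": corr},
    coords={"row": [0, 1], "col": [0, 1]},
    dims={"chol_corr": ["row", "col"]},
)

# One row of panels per matrix entry instead of everything overlaid.
axes_all = az.plot_trace(idata, var_names=["chol_corr"], compact=False)

# Only the off-diagonal entry, which is the one that carries information.
axes_offdiag = az.plot_trace(
    idata,
    var_names=["chol_corr"],
    coords={"row": [0], "col": [1]},
    compact=False,
)
```

With a real trace, the dimension names of chol_corr can be read from idata.posterior["chol_corr"].dims and used in place of "row" and "col".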

AlexAndorra · October 5, 2020, 4:58pm

To build on Junpeng’s answer and make what he suggested even easier, you can use regexes with ArviZ functions:

az.plot_forest(
    idata, var_names=r"_corr", filter_vars="regex", combined=True
);

The same works with plot_trace, but if you’re interested in the values of the correlations, plot_forest will be easier to read.
Hope this helps!
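To see which variables a pattern like r"_corr" would pick up, here is a small sketch using only the standard library. It assumes the pattern is matched anywhere in the variable name (search semantics), and the list of names is an assumption based on the model in the question; the actual names can be checked with idata.posterior.data_vars.

```python
import re

# Posterior variable names expected from the model in the question.
# "chol_corr" and "chol_stds" are the extras created by compute_corr=True.
var_names = ["chol", "chol_corr", "chol_stds", "attrs", "z", "mv"]

pattern = r"_corr"
selected = [name for name in var_names if re.search(pattern, name)]
print(selected)  # ['chol_corr']
```

A broader pattern such as r"^chol_" would select both chol_corr and chol_stds in this list.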

bglick13 · October 6, 2020, 9:12pm

Thank you for the quick replies! Setting compact=False worked for me
