Create correlation matrix from posterior estimates

Hi all,

I have the following arviz.InferenceData object:

fit = az.from_pystan(
    posterior=...,  # fitted Stan object (not shown in the original post)
    coords={
        'driver': list(drivers_idx.keys()),
        'team': list(teams_idx.keys()),
        'circuit': list(circuits_idx.keys()),
    },
    dims={
        'theta': ['driver'],
        'tau': ['team'],
        'phi': ['team', 'circuit'],
    },
)

Basically, theta represents the skill of a driver (for example Max Verstappen), tau represents the skill of a team (Red Bull), and phi represents the circuit-specific skill of a team (Red Bull on Circuit of Monaco).
I want to create a matrix of shape (n_teams, n_circuits) of posterior means of phi - tau. How do I make sure that the pairwise subtraction is done based on the shared team-coords?

Can this matrix be seen as a correlation matrix, or what additional steps are needed to create one?
The goal is to measure the similarity between the circuits.

I think this is an xarray question. If you do elementwise operations (like subtraction) between two xarray DataArrays, xarray aligns them on the dimensions that share the same name and broadcasts across the remaining ones. So something like this will do what you want:

phi_post = fit.posterior.phi  # (chain, draw, team, circuit)
tau_post = fit.posterior.tau  # (chain, draw, team)
difference = phi_post - tau_post  # (chain, draw, team, circuit)
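
To check the alignment behaviour without a fitted model, here is a minimal sketch on synthetic data. The Dataset below is only a stand-in for fit.posterior: the team and circuit labels, the array sizes, and the random values are all made up for illustration. It also takes the mean over chain and draw, which should give the (n_teams, n_circuits) matrix asked about in the question.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for fit.posterior (sizes, labels, and values are made up).
rng = np.random.default_rng(0)
teams = ["Red Bull", "Ferrari", "Mercedes"]
circuits = ["Monaco", "Monza"]
posterior = xr.Dataset(
    {
        "tau": (("chain", "draw", "team"), rng.normal(size=(2, 50, 3))),
        "phi": (
            ("chain", "draw", "team", "circuit"),
            rng.normal(size=(2, 50, 3, 2)),
        ),
    },
    coords={"team": teams, "circuit": circuits},
)

# The subtraction matches values on the shared "team" coordinate and
# broadcasts tau across the "circuit" dimension that only phi has.
difference = posterior["phi"] - posterior["tau"]  # (chain, draw, team, circuit)

# Averaging over the sampling dimensions leaves a (team, circuit) matrix.
mean_diff = difference.mean(dim=("chain", "draw"))
print(mean_diff.dims)
```

Because the matching is done by coordinate label rather than by position, the same subtraction should still pair each team correctly even if the two arrays list the teams in a different order.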

This is not a correlation matrix; it’s just a matrix of pairwise posterior differences between the parameters. Most obviously, nothing forces the differences to lie between -1 and 1. If you want to estimate correlations between teams and circuits, you should include them explicitly in your model. I don’t know anything about Stan, so I can’t help you there :slight_smile:
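
If a purely descriptive similarity measure is enough, one option is to correlate the circuit columns of the (n_teams, n_circuits) matrix across teams. This is only a sketch: the matrix below is random placeholder data, and the result summarises point estimates, so it ignores posterior uncertainty and is not the model-based correlation described above.

```python
import numpy as np

# Placeholder for the (n_teams, n_circuits) matrix of posterior means of
# phi - tau; the values here are random and purely illustrative.
rng = np.random.default_rng(1)
mean_diff = rng.normal(size=(10, 4))  # 10 teams, 4 circuits

# np.corrcoef treats each row as a variable, so transpose to put circuits
# on the rows. The result is (n_circuits, n_circuits) with entries in [-1, 1].
circuit_corr = np.corrcoef(mean_diff.T)
print(circuit_corr.shape)
```

Each entry compares how two circuits rank the teams relative to their overall skill, which is one way to read "similarity between circuits".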
