Is a model with high or low WAIC and LOO better?


According to many materials online, models with a lower WAIC value or a higher elpd_loo should be better.
However, when I tried this, the top-ranked model had the highest WAIC and the highest elpd_loo at the same time. Can anybody explain this? Thank you!


Both quantities estimate the predictive accuracy of the model on unseen data. As we don’t have this unseen data available, we need to make an approximation, and how this approximation is done is what distinguishes WAIC from LOO. Both quantities, however, can be reported on different scales. The most common ones, and the ones available in ArviZ, are:

  • log : log-score, default for both loo and waic in ArviZ
  • negative_log : -1 * log-score
  • deviance : -2 * log-score, common for historical reasons so that WAIC and LOO can be compared with older information criteria such as the Akaike information criterion (AIC) or the deviance information criterion (DIC). For the same reason, the original WAIC paper defined WAIC with this -2 factor, which explains why the scale is often left unstated and it is simply said that lower WAIC is better.
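The relation between the three scales is plain arithmetic, which the following minimal sketch illustrates. The names `SCALE_FACTORS` and `convert_scale` are hypothetical helpers written for this example and are not part of ArviZ; the input value is the elpd_loo estimate quoted in the printout in this thread.

```python
# Multiplier applied to the log-score for each scale listed above.
# SCALE_FACTORS and convert_scale are illustrative names, not ArviZ API.
SCALE_FACTORS = {"log": 1.0, "negative_log": -1.0, "deviance": -2.0}


def convert_scale(value, from_scale, to_scale):
    """Convert an estimate from one scale to another."""
    log_score = value / SCALE_FACTORS[from_scale]  # back to the log scale
    return log_score * SCALE_FACTORS[to_scale]


elpd_loo = -1027.14  # log scale, taken from the printout in this thread
negative_log_loo = convert_scale(elpd_loo, "log", "negative_log")
deviance_loo = convert_scale(elpd_loo, "log", "deviance")
print(negative_log_loo, deviance_loo)
```

Because the printed estimates are rounded to two decimals, converting a rounded value by hand can differ from the ArviZ printout in the last digit.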

Therefore, as indicated in the ArviZ docs, a higher log-score (or a lower deviance or negative log-score) indicates a model with better predictive accuracy.

The object returned by ArviZ contains information about the scale. This is taken into account by compare to correctly sort the models and also affects the names in the WAIC/LOO printout:

In [6]: az.loo(idata) # scale is log, default
Computed from 2000 by 919 log-likelihood matrix

         Estimate       SE
elpd_loo -1027.14    28.85
p_loo       26.78        -

In [7]: az.loo(idata, scale="deviance")
Computed from 2000 by 919 log-likelihood matrix

             Estimate       SE
deviance_loo  2054.27    57.70
p_loo           26.78        -
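To make the sorting rule concrete, here is a rough sketch of how the scale determines the ranking direction. It is not how ArviZ implements compare; `rank_models` and the two model names with their estimates are hypothetical and only illustrate that the same models end up in the same order on either scale.

```python
def rank_models(estimates, scale="log"):
    """Return model names ordered best-first for the given scale.

    Hypothetical helper: on the log scale higher is better, while on the
    negative_log and deviance scales lower is better.
    """
    if scale not in ("log", "negative_log", "deviance"):
        raise ValueError(f"unknown scale: {scale!r}")
    higher_is_better = scale == "log"
    return sorted(estimates, key=estimates.get, reverse=higher_is_better)


# Made-up estimates on the log scale for two hypothetical models.
log_estimates = {"model_a": -1027.14, "model_b": -1040.50}
# The same estimates expressed on the deviance scale (-2 * log-score).
deviance_estimates = {name: -2 * value for name, value in log_estimates.items()}

ranking_log = rank_models(log_estimates, scale="log")
ranking_deviance = rank_models(deviance_estimates, scale="deviance")
print(ranking_log, ranking_deviance)
```

In both cases model_a comes first: it has the highest value on the log scale and the lowest on the deviance scale. This is why a top-ranked model with the highest elpd_loo and the highest WAIC is expected when both are reported on the log scale.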

It is also worth noting that the default scale can be changed using ArviZ rcParams.
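As a sketch of what that could look like, assuming an ArviZ version in which the relevant key is named `stats.ic_scale` (check the rcParams of your installed version, as key names may differ between releases):

```python
import arviz as az

# Assumption: the rcParams key controlling the default scale is "stats.ic_scale".
default_scale = az.rcParams["stats.ic_scale"]

# Temporarily switch the default scale; it is restored when the block exits.
with az.rc_context(rc={"stats.ic_scale": "deviance"}):
    inside_scale = az.rcParams["stats.ic_scale"]

restored_scale = az.rcParams["stats.ic_scale"]
print(default_scale, inside_scale, restored_scale)
```

Assigning to `az.rcParams["stats.ic_scale"]` directly would change the default for the rest of the session instead of only inside the `with` block.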