WAIC & LOO scaling

I posted this over on GitHub but just wanted to put it here too in case it’s helpful - feel free to ignore!

I’ve just noticed that the newest ArviZ release (0.7.0), which PyMC3 relies on for its WAIC and LOO statistics, has switched those statistics from the deviance scale to the log scale. This is contrary to previous PyMC3 behaviour (and to the documentation at https://docs.pymc.io/api/stats.html). The PyMC3 requirements file specifies ArviZ>=0.4.1, so new installs will get 0.7.0 with this changed behaviour.

I think this is something that needs to be dealt with fairly rapidly as it can actually invert previous PyMC3 results (i.e. flipping from lower WAIC=better to higher=better) without any warning. Had I not noticed this I would likely have proceeded to publish work based on inverted model comparison stats, which would not have been good…
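To make the inversion concrete, here is a minimal sketch in plain Python. It assumes the usual relationship between the two scales (deviance = -2 × log-scale value); the model names and numbers are made up purely for illustration and do not come from any real fit.

```python
# Hypothetical log-scale estimates for two models (made-up numbers).
elpd_log = {"model_a": -120.5, "model_b": -135.2}


def to_deviance(elpd):
    """Convert a log-scale estimate to the deviance scale (-2 * elpd)."""
    return -2.0 * elpd


deviance = {name: to_deviance(value) for name, value in elpd_log.items()}

# Log scale: higher is better.
best_by_log = max(elpd_log, key=elpd_log.get)

# Deviance scale: lower is better.
best_by_deviance = min(deviance, key=deviance.get)

# The mistake: applying the old "lower is better" rule to log-scale values.
wrong_reading = min(elpd_log, key=elpd_log.get)

print("deviance values:", deviance)
print("best (log scale, higher is better):", best_by_log)
print("best (deviance scale, lower is better):", best_by_deviance)
print("'best' if log-scale values are read as deviance:", wrong_reading)
```

Both scales agree on the best model when read correctly; the ranking only flips when the old rule is applied to the new numbers. I believe ArviZ’s `waic`, `loo` and `compare` also accept a `scale` argument that can request the deviance scale explicitly, but please check the docs for your installed version rather than taking my word for it.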

Hi Toby,
Thanks for reporting this!

We’re still deciding what to do, but we’ll probably add a note or warning to the output of the relevant functions. This warning will be temporary though, since the DataFrame is already ordered (“a DataFrame, ordered from best to worst model (measured by information criteria)”), and we also provide a rank number. So, a permanent warning would probably be redundant.

Regarding the synchronization of PyMC stats docs and ArviZ docs, this is a different but adjacent issue that will be dealt with too :wink:
Hope this helps :vulcan_salute: