LOO-CV for hierarchical model

Antonis · February 14, 2023, 7:39pm

Hi all

I would like to validate a hierarchical model using LOO-CV. I want to estimate the validation metric both when leaving out a single observation and when leaving out an entire group of observations (LOGO-CV).

Is this possible using ArviZ? If not, what approach would you suggest?

Thank you in advance.

OriolAbril · March 14, 2023, 5:30pm

Hi! You can choose between LOO and LOGO through the log likelihood data you pass to the loo or waic functions.

Here is a quick example using the radon dataset (it comes with ArviZ). It has 919 observations from a total of 85 counties, with a varying number of observations per county.

import arviz as az
idata = az.load_arviz_data("radon")
az.loo(idata, var_name="y")

Computed from 2000 posterior samples and 919 observations log-likelihood matrix.

         Estimate       SE
elpd_loo -1027.18    28.85
p_loo       26.82        -
------

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)      919  100.0%
 (0.5, 0.7]   (ok)          0    0.0%
   (0.7, 1]   (bad)         0    0.0%
   (1, Inf)   (very bad)    0    0.0%

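This performs leave-one-out, because the original pointwise log likelihood has one entry per observation. Note that the loo output reports 2000 posterior samples and 919 observations. If we instead compute the log likelihood of whole groups, by summing the pointwise values within each county, we get: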

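# sum the pointwise log likelihood over the observations in each county
idata.log_likelihood["c"] = idata.log_likelihood.y.groupby(idata.constant_data["county_idx"]).sum()
az.loo(idata, var_name="c")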

Computed from 2000 posterior samples and 85 observations log-likelihood matrix.

         Estimate       SE
elpd_loo -1028.21   183.50
p_loo       24.16        -

There has been a warning during the calculation. Please check the results.
------

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)       60   70.6%
 (0.5, 0.7]   (ok)         18   21.2%
   (0.7, 1]   (bad)         7    8.2%
   (1, Inf)   (very bad)    0    0.0%
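
For anyone who wants to see what the grouping step does to the log likelihood matrix, here is a minimal sketch with NumPy and made-up numbers rather than the radon data. The names `group_idx`, `log_lik` and `group_log_lik` and the array sizes are assumptions for illustration only; this is not ArviZ code, just the same sum-within-group idea on a plain array.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_obs, n_groups = 200, 12, 3

# Hypothetical group label for each observation (plays the role of county_idx).
group_idx = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2])

# Made-up pointwise log likelihood, shape (posterior samples, observations).
log_lik = rng.normal(loc=-1.0, scale=0.3, size=(n_samples, n_obs))

# One-hot membership matrix, shape (observations, groups).
membership = (group_idx[:, None] == np.arange(n_groups)).astype(float)

# Summing log likelihoods within a group gives the joint log likelihood of
# that group for each posterior sample, shape (posterior samples, groups).
group_log_lik = log_lik @ membership

print(log_lik.shape)        # one column per observation -> leave one out
print(group_log_lik.shape)  # one column per group -> leave one group out
```

The number of columns is what loo treats as the number of "observations", which is why the output above changes from 919 to 85 once the log likelihood is summed by county.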

Antonis · April 4, 2023, 6:40pm

Thank you! This approach solves my problem.
