I would like to perform validation of a hierarchical model using LOO-CV. I would like to estimate the validation metric for both leaving out only one observation and leaving out an entire group of observations (LOGO-CV).
Is this possible using arviz? If not, what is your suggested approach?
Hi! Yes, whether you get LOO or LOGO depends on the log likelihood data you pass to the loo or waic functions.
Here is a quick example using the radon dataset (one of the ArviZ example datasets). It has 919 observations from a total of 85 counties, and the number of observations varies by county.
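To make the idea concrete, here is a minimal NumPy sketch of how pointwise log likelihoods could be turned into groupwise ones. It assumes the observations are conditionally independent given the model parameters, so that a group's joint log likelihood is the sum of its members' pointwise values. The helper name group_log_likelihood and its arguments are hypothetical and not part of ArviZ.

```python
import numpy as np


def group_log_likelihood(log_lik, group_idx):
    """Sum pointwise log likelihoods within each group (hypothetical helper).

    log_lik:   array of shape (n_samples, n_obs), one column per observation
    group_idx: integer labels of shape (n_obs,), with values 0..n_groups-1
    Returns an array of shape (n_samples, n_groups).
    """
    log_lik = np.asarray(log_lik, dtype=float)
    group_idx = np.asarray(group_idx)
    n_groups = int(group_idx.max()) + 1
    grouped = np.zeros((log_lik.shape[0], n_groups))
    for g in range(n_groups):
        # Assumption: conditional independence, so log probabilities add up.
        grouped[:, g] = log_lik[:, group_idx == g].sum(axis=1)
    return grouped


# Toy data: 2 posterior samples, 3 observations, the first two in group 0.
toy_log_lik = np.log(np.array([[0.5, 0.25, 0.5], [0.25, 0.5, 0.5]]))
toy_grouped = group_log_likelihood(toy_log_lik, [0, 0, 1])
print(toy_grouped.shape)
```

With an array shaped like this (samples by groups) as the log likelihood, loo leaves out one group at a time instead of one observation at a time.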
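```python
import arviz as az

# Load the radon example dataset as an InferenceData object
idata = az.load_arviz_data("radon")
# PSIS-LOO on the pointwise log likelihood of the observed variable "y"
az.loo(idata, var_name="y")
```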
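This performs leave-one-out, because the pointwise log likelihood stored in the dataset has one value per observation. Note that the loo output reports a 2000x919 log likelihood matrix, that is, 2000 posterior samples by 919 observations. If we instead compute the log probabilities of whole groups, we get: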