Hi @OriolAbril, you raise some valid points. I’ll open a new topic to see if I can address some of these. Thanks!
@shbhuk did you ever figure this out for exoplanet?
I got a hack to work to concatenate multiple likelihoods in a single model, as @OriolAbril suggests.
Here, `opt.models.values()` are my different models; my observations are named `12CN-1` and `12CN-2`, with associated coordinates `12CN-1_dim_0` and `12CN-2_dim_0`. I rename each observation to `spec`, rename the coordinate axes to `dim_0`, then concatenate along the shared coordinate. The result is a new, single "observation" named `12CN-1, 12CN-2` with coordinate `dim_0`.
```python
import xarray as xr

for model in opt.models.values():
    # Rename each per-spectrum log-likelihood variable to a common name
    # ("spec"), align the dimension names, and concatenate into a single
    # pooled "observation" that LOO/WAIC can sum over.
    model.trace.log_likelihood["12CN-1, 12CN-2"] = xr.concat(
        [
            model.trace.log_likelihood.rename({"12CN-1": "spec"})["spec"].rename({"12CN-1_dim_0": "dim_0"}),
            model.trace.log_likelihood.rename({"12CN-2": "spec"})["spec"].rename({"12CN-2_dim_0": "dim_0"}),
        ],
        dim="dim_0",
    )
```
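The same rename-and-concatenate trick can be seen on a self-contained toy dataset (hypothetical variable names `y1`/`y2` standing in for the per-spectrum likelihoods above):

```python
import numpy as np
import xarray as xr

# Toy log-likelihood dataset: two observed variables, each with its own
# dimension, mimicking ArviZ's per-variable log_likelihood layout.
ll = xr.Dataset(
    {
        "y1": (("chain", "draw", "y1_dim_0"), np.random.randn(2, 100, 5)),
        "y2": (("chain", "draw", "y2_dim_0"), np.random.randn(2, 100, 7)),
    }
)

# Rename each variable and its dimension to a common name, then
# concatenate along the shared dimension into one pooled variable.
pooled = xr.concat(
    [
        ll.rename({"y1": "obs"})["obs"].rename({"y1_dim_0": "dim_0"}),
        ll.rename({"y2": "obs"})["obs"].rename({"y2_dim_0": "dim_0"}),
    ],
    dim="dim_0",
)

print(pooled.sizes)  # chain and draw preserved; dim_0 = 5 + 7 = 12
```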
Then I can compare models with WAIC/LOO using
```python
az.compare(
    {cloud: model.trace for cloud, model in opt.models.items()},
    var_name="12CN-1, 12CN-2",
)
```
Aki Vehtari put together a really nice FAQ on cross-validation. It starts by explaining that the goal is to estimate the expected log predictive density and then works through the consequences of that. It also covers the exchangeability issue (section 7) that @OriolAbril put his finger on, and how LOO relates to WAIC (section 21).
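For reference, the quantity being estimated (in the notation common to the LOO/WAIC literature; the symbols here are my paraphrase, not the FAQ's) is, roughly:

```latex
\mathrm{elpd} = \sum_{i=1}^{n} \int p_t(\tilde{y}_i)\, \log p(\tilde{y}_i \mid y)\, d\tilde{y}_i
```

where $p_t$ is the true data-generating distribution and $p(\tilde{y}_i \mid y)$ is the posterior predictive density for a new observation.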
If your data-generating process is a mixture, you need to estimate the mixture rate to get a properly generative model. At that point you have exchangeability and can technically apply LOO or WAIC, but it's not clear that LOO will be a good way to compare models in that case.
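A minimal sketch of what "estimate the mixture rate" means, with hypothetical data and plain scipy maximum likelihood rather than a full Bayesian model:

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

# Hypothetical data: 70 points from N(0, 1) and 30 from N(4, 1),
# so the true mixture weight on the first component is 0.7.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 1, 70), rng.normal(4, 1, 30)])

def neg_log_lik(w):
    # Pointwise mixture density: w * N(0, 1) + (1 - w) * N(4, 1).
    # Component parameters are assumed known here; only the rate is fit.
    dens = w * stats.norm.pdf(data, 0, 1) + (1 - w) * stats.norm.pdf(data, 4, 1)
    return -np.log(dens).sum()

res = minimize_scalar(neg_log_lik, bounds=(1e-3, 1 - 1e-3), method="bounded")
print(res.x)  # estimated mixture weight, near the true 0.7
```

With the rate estimated (or given a prior), the model is generative and each observation is exchangeable, which is what makes LOO/WAIC formally applicable.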