Hi there,
Suppose I have the following two datasets for a simple hierarchical linear regression with no intercept:
- Dataset 1: X1, Y1
- Dataset 2: X2, Y2
Each observation of X1, X2, Y1 and Y2 is a scalar.
Dataset 2 is a transformed version of Dataset 1, so it can differ in two ways:
- The X and Y values can be different
- The number of observations can be different (due to filtering or other things)
I am reasonably well-versed in using WAIC/LOO for model comparison on a fixed dataset, but would it ever make sense to use WAIC/LOO for model comparison when the datasets are different?
What I’m trying to understand is which dataset a given linear model specification is most likely to generalize on. For my purposes, I can use either dataset and still recover the quantity of interest at the end.
For issue 2 (the different number of observations): I am aware that WAIC/LOO scale with dataset size, but would it be possible to divide by the sample size to get a kind of normalized, per-observation estimate? I think this is being done in section 9.3.1 of this book:
https://bookdown.org/marklhc/notes_bookdown/model-comparison-and-regularization.html
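To make the normalization I have in mind concrete, here is a minimal sketch in plain Python. It assumes I already have a matrix of pointwise log-likelihood values, one row per posterior draw and one column per observation. The function names (`log_mean_exp`, `waic_per_observation`) and the toy numbers are made up for illustration and are not taken from the book or from any library. It only addresses the sample-size part (issue 2); it does not by itself make the values comparable when Y has been transformed.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log(mean(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def waic_per_observation(log_lik):
    """Return (elpd_waic, elpd_waic / N, standard error of the per-observation value).

    log_lik[s][i] is the log-likelihood of observation i under posterior draw s
    (hypothetical input layout; needs at least 2 draws).
    """
    n_obs = len(log_lik[0])
    pointwise = []
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        lppd_i = log_mean_exp(column)   # log pointwise predictive density
        p_waic_i = variance(column)     # effective-parameter penalty (sample variance over draws)
        pointwise.append(lppd_i - p_waic_i)

    elpd = sum(pointwise)
    elpd_per_obs = elpd / n_obs
    # Standard error of the mean of the pointwise contributions.
    se_per_obs = math.sqrt(variance(pointwise) / n_obs) if n_obs > 1 else float("nan")
    return elpd, elpd_per_obs, se_per_obs


# Toy, made-up log-likelihood matrices: 3 draws each, 4 vs. 2 observations.
log_lik_1 = [[-1.2, -0.9, -1.5, -1.1],
             [-1.0, -1.1, -1.4, -1.3],
             [-1.3, -0.8, -1.6, -1.0]]
log_lik_2 = [[-0.7, -0.9],
             [-0.8, -1.0],
             [-0.6, -1.1]]

for name, ll in [("Dataset 1", log_lik_1), ("Dataset 2", log_lik_2)]:
    elpd, per_obs, se = waic_per_observation(ll)
    print(f"{name}: elpd_waic={elpd:.3f}, per observation={per_obs:.3f} (SE {se:.3f})")
```

The total `elpd_waic` grows in magnitude with the number of observations, while the per-observation value is an average, which is what I would be comparing across the two datasets.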
For issue 1 (the different X and Y values), I am not sure.
Any help would be much appreciated!