Model comparison 101

Hi there,

I would like to take advantage of the marginal likelihood calculated with SMC. In my case, the model combines several “components”, each with its own set of parameters, and I have a custom likelihood function L that combines the likelihoods of several observables. When I compare the marginal likelihoods of models with a varying number of components, I see that the marginal likelihood first increases, presumably until a sufficient number of components is reached, and then decreases, presumably because higher-dimensional models get penalized — which is what I would expect (I think…).
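For what it's worth, here is a toy sketch of that rise-then-fall behaviour in a setting where the marginal likelihood is available in closed form (Bayesian polynomial regression with a Gaussian prior and known noise, so no SMC needed). All numbers — the data-generating quadratic, the noise level, the prior variance — are made up for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Hypothetical toy data: generated from a quadratic with known noise sigma.
n, sigma = 40, 0.5
x = np.linspace(-1.0, 1.0, n)
y = 1.0 - 2.0 * x + 3.0 * x**2 + sigma * rng.normal(size=n)

def log_evidence(degree, prior_var=4.0):
    """Closed-form log marginal likelihood for polynomial regression with
    coefficients w ~ N(0, prior_var * I): marginally,
    y ~ N(0, sigma^2 I + prior_var * Phi Phi^T), parameters integrated out."""
    Phi = np.vander(x, degree + 1, increasing=True)
    cov = sigma**2 * np.eye(n) + prior_var * (Phi @ Phi.T)
    return multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y)

# Log evidence for increasing model complexity (polynomial degree 0..6).
# It typically rises up to the true degree, then slowly falls as extra,
# unneeded parameters incur the Occam penalty.
log_Z = [log_evidence(d) for d in range(7)]
```

The "number of components" in my model plays the same role as the polynomial degree here: past the point where the data are adequately described, additional parameters only dilute the evidence.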

Now here’s my question. I understand that models can be compared safely only once they have converged, but are they also supposed to reach the same likelihood (L) values? What bothers me is that a model with a larger number of components can end up with a lower L, even though I would have expected the higher-dimensional model to fit at least as well, since more parameters should make it easier to reproduce the data. More generally, I’m wondering how we can compare models using “simply” the marginal likelihood, which integrates the parameters out and so apparently ignores the actual likelihood values L.
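To make my confusion concrete, here is a minimal 1D numerical sketch (all values invented) of two models that share the exact same likelihood function — so they can reach the same maximum L — yet have different marginal likelihoods, because the evidence averages L over the prior rather than taking its peak:

```python
import numpy as np

# One observation y_obs with a Gaussian likelihood in a single parameter theta.
theta = np.linspace(-20.0, 20.0, 20001)
dtheta = theta[1] - theta[0]
y_obs, sigma = 1.0, 1.0
lik = np.exp(-0.5 * ((y_obs - theta) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def evidence(prior_width):
    """Grid approximation of Z = ∫ L(theta) p(theta) dtheta
    for a zero-mean Gaussian prior of the given width."""
    prior = (np.exp(-0.5 * (theta / prior_width) ** 2)
             / (prior_width * np.sqrt(2 * np.pi)))
    return np.sum(lik * prior) * dtheta

# Same likelihood curve in both cases, but the wider prior spreads its mass
# over regions where L is tiny, so its evidence is lower (the Occam penalty).
z_narrow, z_wide = evidence(2.0), evidence(10.0)
```

If this picture is right, then the evidence comparing my models is not ignoring L at all — it is weighting L by how much prior mass each model places where L is large, which is exactly how a bigger model with a comparable (or even higher) best-fit L can still lose.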


One more question I should have asked: if one has to account for the prior odds of each model, P(M), because the models are not equally probable a priori, I’m not sure I understand what this quantity corresponds to. It seems we’re talking about the probability of a model with both the parameters and the data integrated out? Would you happen to know of an example in which the Bayes factor needs to be updated by the prior model odds?
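In case it helps frame the question, here is the arithmetic as I currently understand it, with made-up log marginal likelihoods and made-up prior model probabilities — posterior odds = Bayes factor × prior odds:

```python
import math

# Hypothetical log marginal likelihoods (e.g. as reported by SMC) for two
# models, and unequal prior model probabilities P(M1), P(M2).
log_Z1, log_Z2 = -105.3, -103.9   # invented values for illustration
p_M1, p_M2 = 0.8, 0.2             # invented prior model probabilities

log_bayes_factor = log_Z1 - log_Z2           # evidence ratio, M1 vs M2
log_prior_odds = math.log(p_M1 / p_M2)       # P(M1) / P(M2)
log_posterior_odds = log_bayes_factor + log_prior_odds

# Posterior probability of M1 (assuming only these two models).
posterior_p_M1 = 1.0 / (1.0 + math.exp(-log_posterior_odds))
```

So if I read this correctly, the Bayes factor alone only settles the comparison when P(M1) = P(M2); what I am missing is a concrete case where one would actually justify unequal P(M) values.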