Model Comparison, DiscreteUniform vs. Bernoulli, ElemwiseCategorical vs. CategoricalGibbsMetropolis

Dear group,

I am currently experimenting with model comparison in the Bayesian context. As a basis I am using chapter 12 of Kruschke and the pymc3 solution given here: https://github.com/aloctavodia/Doing_bayesian_data_analysis/blob/master/12_OneOddGroupModelComp.py

The solution works. No problem with that.

But when I run this script, it tells me: “ElemwiseCategorical is deprecated, switch to CategoricalGibbsMetropolis”.

When I do that, it tells me that I cannot use DiscreteUniform but have to use either a binary or a categorical distribution, so I switched to Bernoulli. The changes I made are the following:
from: ch12_model_index = pm.DiscreteUniform('model_index', lower=0, upper=1)
to: ch12_model_index = pm.Bernoulli('model_index', p=0.5)
from: ch12_step = pm.ElemwiseCategorical(vars=[ch12_model_index], values=[0, 1])
to: ch12_step = pm.CategoricalGibbsMetropolis(vars=[ch12_model_index])  # , order=[0,1]

I am running the sampling process with 4 chains:
ch12_trace = pm.sample(4000, step=ch12_step, random_seed=42, chains=4, tune=1000) #, nuts_kwargs=dict(target_accept=.85)

Now I run into the problem that model_index stays stuck at 1, so I don’t get a “model comparison” posterior (previously I got e.g. mean: 0.26). Even if I keep ElemwiseCategorical and only switch from DiscreteUniform to Bernoulli, I see several indicators that the model is not converging, e.g. Rhat far from 1.0 and n_eff below 200.

How would I correctly replace the deprecated ElemwiseCategorical with CategoricalGibbsMetropolis here?

And in general: are there more examples somewhere of how to do model comparison correctly with pymc3?

Thanks a lot! Any hints in any direction are very welcome!
Christian

There are some problems with our discrete Gibbs samplers (see https://github.com/pymc-devs/pymc3/issues/2879 and https://github.com/pymc-devs/pymc3/issues/1563), and they are in general not well maintained. The solution is to avoid discrete latent nodes in your model and to rewrite it as a mixture model.
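
For instance, here is a minimal sketch of what marginalizing a two-model index out of the likelihood looks like. The Gaussian candidate models and the data are made up for illustration, not taken from the Kruschke example:

    import numpy as np
    import pymc3 as pm
    import theano.tensor as tt

    y = np.random.randn(100)  # illustrative data

    with pm.Model() as marginalized:
        # parameters of the two candidate models (illustrative)
        mu0 = pm.Normal('mu0', 0., 1.)
        mu1 = pm.Normal('mu1', 0., 10.)

        # whole-dataset log-likelihood under each candidate model
        logp0 = pm.Normal.dist(mu0, 1.).logp(y).sum()
        logp1 = pm.Normal.dist(mu1, 1.).logp(y).sum()

        # marginalize the discrete index k with prior p(k) = 0.5:
        # log p(y) = logsumexp(log 0.5 + logp0, log 0.5 + logp1)
        pm.Potential('marginal', pm.math.logsumexp(
            tt.stack([np.log(0.5) + logp0, np.log(0.5) + logp1])))

        # track the log odds so posterior model probabilities can be
        # recovered from the trace afterwards
        pm.Deterministic('log_odds_01', logp0 - logp1)

        trace = pm.sample(2000, tune=1000)

Since there is no discrete node left, NUTS can sample the whole model.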

Thanks a lot for your quick answer!

Are there some examples of how to rewrite the model into a mixture model? And in general: are there more examples somewhere of how to do model comparison (which requires a latent discrete node) with pymc3?

Actually, this was always one of the reasons why I prefer pymc3 over Stan: pymc3 can deal with discrete nodes.

Thanks!
Christian

Well, you can write down a model with a discrete latent variable in PyMC3, but inference on it does not always work…

For model comparison, LOO and WAIC are the two best modern options you can use.
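
For example, a sketch assuming model_0/trace_0 and model_1/trace_1 are two candidate models you have already defined and sampled (the exact pm.compare signature has varied a bit between pymc3 releases):

    import pymc3 as pm

    # information criteria for a single fitted model
    print(pm.waic(trace_0, model_0))
    print(pm.loo(trace_0, model_0))

    # rank all candidates in one table
    comparison = pm.compare({model_0: trace_0, model_1: trace_1}, ic='LOO')
    print(comparison)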

LOO is Leave-One-Out Cross-Validation and WAIC is the Watanabe–Akaike Information Criterion, correct?
Then you’re basically saying that Bayesian model comparison via a hierarchical model is “out”? Is this because of technical difficulties (like my problem) or for some fundamental theoretical reason?
Thanks!
Christian

Yep

Mostly because of problems with inference: you need to write a huge mixture model, so you have

  1. the difficulty of doing inference on a mixture model
  2. the difficulty of scaling (you have a much larger model now)
  3. a lack of flexibility: if you want to add a new model you need to rewrite it and run all the sampling again, whereas with LOO or WAIC you just fit the new model.

Having said that, there are ways to do it properly; e.g., see Motif of the Mind | Junpeng Lao, PhD, which is based on Christian Robert’s paper: https://arxiv.org/pdf/1412.2044v2.pdf
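
The idea there is to replace the discrete model index with a continuous mixture weight and estimate that weight instead. A minimal sketch, again with made-up Gaussian candidate models and data:

    import numpy as np
    import pymc3 as pm
    import theano.tensor as tt

    y = np.random.randn(100)  # illustrative data

    with pm.Model() as mixture_test:
        # continuous mixture weight in place of a discrete model index
        w = pm.Beta('w', 1., 1.)
        mu0 = pm.Normal('mu0', 0., 1.)   # candidate model 0
        mu1 = pm.Normal('mu1', 0., 10.)  # candidate model 1

        logp0 = pm.Normal.dist(mu0, 1.).logp(y)
        logp1 = pm.Normal.dist(mu1, 1.).logp(y)

        # per-observation mixture: w * p(y | M0) + (1 - w) * p(y | M1)
        pm.Potential('mix', pm.math.logsumexp(
            tt.stack([tt.log(w) + logp0, tt.log(1 - w) + logp1], axis=0),
            axis=0).sum())

        trace = pm.sample(2000, tune=1000)

A posterior for w piled up near 1 favours model 0; piled up near 0, model 1.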

Thank you very much for your links! While looking through your blog I also found: https://junpenglao.xyz/Blogs/posts/2017-11-22-Marginal_likelihood_in_PyMC3.html

This basically answers my question even more directly: how to get to the models’ marginal likelihoods and then to the Bayes factors. Great blog posts!

By the way: at the bottom of that post you say: “I guess it is time to double check the marginal likelihood estimation in SMC.” Did you find the root cause of why the different methods deliver differing results? Just out of curiosity; my core question is answered now.

Yes, after https://github.com/pymc-devs/pymc3/pull/3124 the computation from SMC is much closer to the output of the bridge sampler, but more experimentation is still needed.
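
For illustration, this is roughly how the SMC estimate can be turned into a Bayes factor; the models and data here are made up, and the attribute holding the estimate has moved around between pymc3 releases:

    import numpy as np
    import pymc3 as pm

    y = np.random.randn(50)  # illustrative data

    with pm.Model() as model_0:
        mu = pm.Normal('mu', 0., 1.)
        pm.Normal('y', mu, 1., observed=y)
        trace_0 = pm.sample(1000, step=pm.SMC())

    with pm.Model() as model_1:
        mu = pm.Normal('mu', 0., 10.)
        pm.Normal('y', mu, 1., observed=y)
        trace_1 = pm.sample(1000, step=pm.SMC())

    # SMC stores its marginal likelihood estimate on the model object
    # (attribute name/location differs between pymc3 versions)
    bf_01 = model_0.marginal_likelihood / model_1.marginal_likelihood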

Also, be careful what you wish for when using Bayes factors. I remain skeptical of both the accuracy with which they can be computed and their usefulness.