Hi everyone,
I’m working on comparing models in PyMC, especially ones where I’ve marginalized out latent discrete variables. Is it straightforward to use the log-likelihood output from PyMC for this comparison, or are there additional steps needed?
Does anyone have experience or insights on this? Also, are there any references you could recommend on this topic?
Thanks in advance for your help!
Depending on what types of latent variables you’re marginalizing, you could try the new automatic marginalization features in pymc-experimental. They now support automatic un-marginalization as well, so you can fit the model and then automatically recover the posterior distributions over the discrete variables. The .unmarginalize() method is currently undocumented, but there’s a PR for an example notebook here.
I’m a big dummy; the PR was already merged. The example notebook to check out is here.
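For intuition about what the automatic machinery is doing (this is a hand-rolled NumPy/SciPy sketch of the underlying math, not the pymc-experimental API, and the weights, means, and data are invented for illustration): marginalizing a discrete latent z just means replacing p(y | z) in the likelihood with the mixture sum over its states, so the sampler never sees a discrete variable.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

# Hypothetical two-component mixture: z ~ Categorical(w), y | z=k ~ Normal(mu[k], 1)
w = np.array([0.3, 0.7])        # mixing weights p(z = k)
mu = np.array([-2.0, 2.0])      # component means
y = np.array([-1.8, 2.1, 1.9])  # observed data

# Marginal log-likelihood: log p(y) = log sum_k p(z=k) p(y | z=k),
# computed stably with logsumexp over the component axis. This is the
# quantity a marginalized model hands to the sampler.
log_components = np.log(w) + norm.logpdf(y[:, None], loc=mu, scale=1.0)
marginal_logp = logsumexp(log_components, axis=1)
print(marginal_logp.sum())
```

These per-observation marginal log-likelihood values are also what model-comparison tools (e.g. LOO/WAIC) operate on.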
You can of course recover things by hand by using the logp values directly. There’s an example of doing this at the end of this blog post (scroll down to “recovering mixture indexes”, though the whole post is worth reading if you’re going to go this route).
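A minimal sketch of the by-hand recovery, under assumed toy numbers (the weights, posterior draws, and data below are invented, not from the blog post): for each posterior draw, the posterior over the index is p(z=k | y, theta) proportional to w_k p(y | z=k, theta), i.e. a softmax of the per-component log-probabilities, which you then average over draws.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

# Toy setup: 500 posterior draws of the two component means for a
# two-component Gaussian mixture with fixed weights (all numbers invented).
rng = np.random.default_rng(0)
w = np.array([0.3, 0.7])
mu_draws = rng.normal(loc=[-2.0, 2.0], scale=0.1, size=(500, 2))  # (draw, component)
y = np.array([-1.8, 2.1])                                          # observed points

# Per draw and data point: log w_k + log p(y | mu_k), normalized over k.
# Broadcast shape is (draw, observation, component).
logp_k = np.log(w) + norm.logpdf(y[None, :, None], loc=mu_draws[:, None, :], scale=1.0)
post_z = np.exp(logp_k - logsumexp(logp_k, axis=-1, keepdims=True))

# Average over posterior draws to get each observation's index posterior.
p_z = post_z.mean(axis=0)  # shape (n_obs, n_components)
print(p_z.round(3))
```

The same pattern works with draws pulled from a fitted trace instead of the synthetic `mu_draws` here.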
Big thanks for the quick and insightful reply! That is very useful for my work. Can’t wait to try the automatic marginalization features.