Hi everyone!

I have no previous experience with marginalizing discrete variables out of a model so that PyMC3 can use NUTS, and I don't see how to generalize the cases discussed in the literature to mine - I am probably missing something.

Here is the model I want to work with:
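In outline, the likelihood (as in the simulation code below) is

e_i \mid c_i, \zeta \;\sim\; \mathcal{N}\!\left(\mu_i(\zeta, c_i),\ \sigma^2\right)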

Where:

- \textbf{e} is a vector whose i-th element is the cost for observation i (observed)
- \textbf{c} is a vector whose i-th element is the category of observation i (observed)
- \sigma, \alpha_0, \alpha_1 are continuous (unobserved)
- \zeta is discrete (this is the problematic one), taking one of m possible values (unobserved)

In a different notation and somewhat more specifically:
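Concretely, with a normal likelihood as in the simulation code below:

e_i \sim \mathcal{N}(\mu_i, \sigma^2), \qquad \mu_i = \alpha_0 + \alpha_1 \, L_{\zeta,\, c_i}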

Where L is a (known) matrix whose (i,j)-th element is some feature, determined by the i-th possible value of \zeta and the j-th category, that the cost depends on (among other things).

Ultimately, I am interested in finding the probability of each possible \zeta being the true one.

Here’s the code to produce some fake data:

```
import numpy as np

# true parameter values
sigma = 0.1
a_0, a_1 = 0, 1
z = 3  # true value of zeta (a row index into L)
L = np.array([
    [1, 2, 1],
    [3, 2, 1],
    [1, 4, 5],
    [5, 6, 8],
])
# category of the i-th observation: 1000 observations per category
category_i = np.repeat(np.arange(L.shape[1]), 1000)
mu_i = a_0 + a_1 * L[z, category_i]
outcome_i = np.random.normal(loc=mu_i, scale=sigma)
```

And here is the most straightforward PyMC3 model to fit the data:

```
import pymc3 as pm
import theano

with pm.Model() as model:
    sigma = pm.HalfNormal('sigma', sigma=2)
    a_0 = pm.Normal('a_0', mu=0, sigma=1)
    a_1 = pm.Normal('a_1', mu=0, sigma=1)
    # uniform prior over the m possible values of zeta
    z = pm.Categorical('z', p=np.ones(L.shape[0]) / L.shape[0])
    mu_i = a_0 + a_1 * theano.shared(L)[z][category_i]
    outcomes = pm.Normal(
        'outcomes',
        mu=mu_i,
        sigma=sigma,
        observed=outcome_i,
    )
    trace = pm.sample(
        cores=1,
        return_inferencedata=True,
    )
```

The results are pretty good, with estimates close to the true values:

However, I will eventually have to fit a *huge* amount of data (30k possible values for \zeta and millions of observations), so efficiency is crucial (and I may end up using the variational API). But the discrete parameter \zeta forces PyMC3 to use CategoricalGibbsMetropolis, which makes sampling slower than it could be. Is there some way to manipulate the model so that, instead of sampling \zeta directly, a probability vector over its values is sampled?

All the cases I've seen (e.g. here) use a Normal mixture, but those models include a Dirichlet distribution even in the non-marginalized version, so I am unsure how to adapt them to my model above. I am also unsure whether marginalizing \zeta out is the right approach at all, since \zeta is precisely the value I am interested in!
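Or is the following the way to get \zeta back after marginalizing? Given posterior draws of (\alpha_0, \alpha_1, \sigma), re-weight each possible \zeta by its data likelihood under each draw and average (pure NumPy sketch, names mine; the toy call below plugs in the true parameters rather than real posterior draws):

```
import numpy as np
from scipy.special import logsumexp

def zeta_posterior(a0_draws, a1_draws, sigma_draws, L, category_i, outcome_i):
    """Estimate p(zeta = z | data) by averaging, over posterior draws of
    (a_0, a_1, sigma), the normalized per-z likelihood of the data
    (a uniform prior over zeta cancels in the normalization)."""
    m = L.shape[0]
    probs = np.zeros(m)
    for a0, a1, s in zip(a0_draws, a1_draws, sigma_draws):
        mu = a0 + a1 * L[:, category_i]          # shape (m, n)
        resid = outcome_i[None, :] - mu
        # Gaussian log-likelihood of the whole data set for each z
        logp_z = (-0.5 * (resid / s) ** 2
                  - np.log(s) - 0.5 * np.log(2 * np.pi)).sum(axis=1)
        probs += np.exp(logp_z - logsumexp(logp_z))
    return probs / len(a0_draws)

# toy check: fake data generated with z = 3, evaluated at the true parameters
L = np.array([[1, 2, 1], [3, 2, 1], [1, 4, 5], [5, 6, 8]], dtype=float)
category_i = np.repeat(np.arange(L.shape[1]), 100)
outcome_i = np.random.normal(loc=L[3, category_i], scale=0.1)
p = zeta_posterior([0.0], [1.0], [0.1], L, category_i, outcome_i)
# p should put almost all of its mass on z = 3
```

Is this post-processing correct, or does marginalization lose information about \zeta that cannot be recovered this way?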

Thank you very much for your help, any pointers/hints would be much appreciated!