Can you marginalize a mixture model where the draws from the different components are not independent?

The transition kernel would have to depend on the fraction of previous observations that are in the 1 state, which introduces a long-range dependence.
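To make that concrete, here is a rough sketch of what such a kernel might look like (the logistic form, the target, and the names here are my own invention, not the actual model): the probability that the next trial is a 1 gets pulled down whenever the running fraction of 1s drifts above the target.

import numpy as np

rng = np.random.default_rng(0)

def sample_constrained_states(n_trials=100, target=0.5, strength=5.0):
    # Hypothetical kernel: the probability of the next state being 1 depends on
    # the running fraction of previous 1s, i.e. on the entire history so far
    states = []
    for t in range(n_trials):
        frac_ones = np.mean(states) if states else target
        p_one = 1.0 / (1.0 + np.exp(strength * (frac_ones - target)))
        states.append(rng.random() < p_one)
    return np.array(states, dtype=int)

print(sample_constrained_states().mean())  # stays close to the 0.5 target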

I'm presenting some work based on this model to an audience with lots of Stan users next week, so it would be nice to know whether this model could be run in Stan, where (IIUC) it's impossible to sample from discrete variables. It seems like it would be very difficult in that case.

Yes, for the constraint to hold exactly, there must necessarily be global correlations of some sort. If the first 50% of trials are all 1, the last 50% must be 0 (and of course it shouldn't matter which subset you choose). My idea was just to provide a hacky approximation to these global dependencies by introducing a soft local kernel, which, if implemented over enough n_lags, may not be too terrible.

Here is the idea with a single lag, which MarginalModel could marginalize automatically for you (but unfortunately not with more lags):

import pymc as pm
import pymc_experimental as pmx

with pm.Model() as m:
    # Two-state chain that tends to switch state on every step, which softly
    # keeps the fraction of 1s near 0.5 over the sequence
    idx = pmx.distributions.DiscreteMarkovChain("idx", P=[[0.1, 0.9], [0.9, 0.1]], shape=(10,))
    emission = pm.Normal("emission", pm.math.where(idx, 1, -1), sigma=0.1, shape=(10,))
    # iid mixture with the same marginal emission distribution, for comparison
    iid_emission = pm.NormalMixture("iid_emission", w=[0.5, 0.5], mu=[1, -1], sigma=0.1, shape=(10,))

# Compare how much the fraction of positive emissions varies across draws:
# the Markov-chain version keeps it much closer to 0.5 than the iid mixture
print((pm.draw(emission, draws=100) > 0).mean(-1).std(), (pm.draw(iid_emission, draws=100) > 0).mean(-1).std())
# 0.063 0.153
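For completeness, here is a rough sketch of how the single-lag version might be wrapped in a MarginalModel so the discrete chain gets summed out before sampling; the observed data y_obs is just a placeholder and the exact API details may differ:

import numpy as np
import pymc as pm
import pymc_experimental as pmx

y_obs = np.concatenate([np.ones(5), -np.ones(5)])  # placeholder data

with pmx.MarginalModel() as marginal_m:
    idx = pmx.distributions.DiscreteMarkovChain("idx", P=[[0.1, 0.9], [0.9, 0.1]], shape=(10,))
    emission = pm.Normal("emission", pm.math.where(idx, 1, -1), sigma=0.1, observed=y_obs)

marginal_m.marginalize(["idx"])  # sums the discrete chain out of the logp
with marginal_m:
    idata = pm.sample()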

Yes, without sampling discrete variables it seems computationally very challenging, although I'm sure someone will find a math trick of some sort. It's a very interesting problem btw!


🏅 to PyMC for allowing me to sample from the model without math tricks 😄

I had one final question. One downside of the Potential trick is that I can't sample new draws from this distribution directly, as sample_posterior_predictive(trace, var_names=["group_assignment", "y"]) ignores the Potential. I've solved this in a hacky way by setting p to 0 and 1 to sample from the pure distribution, and then combining such samples manually. But it would of course be nice to be able to sample from the model directly. Is it possible to rewrite it in some way to enable this? My understanding is that it's impossible, since the Potential is by necessity downstream of the group assignment. But am I missing something clever?

That warning about the Potential is just a blanket warning. In your case you are sampling conditioned on the posterior indicator variables, so the Potential doesn't matter anymore, right? It would only matter if you were interested in resampling the indicator variables (or in prior_predictive).

Right, but I do want to resample the group_assignment variable.

Right, in that case you can't, because you haven't defined a pure generative model. You can do some rejection-based sampling of the indicator variables; you just have to be careful to match the Potential penalty term if you want to be consistent.
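For instance, something along these lines could work (a sketch only; the Bernoulli prior and the quadratic penalty are stand-ins for whatever the actual model and Potential are). Because the penalty is non-positive, exp(penalty) is a valid acceptance probability, so the accepted draws target prior(z) * exp(penalty(z)):

import numpy as np

rng = np.random.default_rng(0)

def potential_logp(z, lam=10.0, target=0.5):
    # Stand-in for the model's Potential: a non-positive soft penalty on imbalance
    return -lam * (z.mean() - target) ** 2

def rejection_sample_assignments(n=100, p=0.5, max_tries=10_000):
    for _ in range(max_tries):
        z = (rng.random(n) < p).astype(float)  # propose from the "pure" prior
        if np.log(rng.random()) < potential_logp(z):  # accept with prob exp(penalty)
            return z.astype(int)
    raise RuntimeError("no draw accepted; the penalty may be too strong")

print(rejection_sample_assignments().mean())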
