Can you marginalize a mixture model where the draws from the different components are not independent?

Yes. For the constraint to hold, there must necessarily be global correlations of some sort: if the first 50% of trials are all 1, the last 50% must all be 0 (and of course it shouldn't matter which subset you pick). My idea was just to provide a hacky approximation to these global dependencies by introducing a soft local kernel, which, if implemented over enough n_lags, may not be too terrible.
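To see why the constraint forces correlations, here is a quick numpy sketch (my illustration, not from the model above) of the hard version: if assignments are random permutations of a perfectly balanced 0/1 vector, any two positions are anti-correlated, with correlation exactly -1/(n-1) rather than 0, so no iid mixture can reproduce them.

import numpy as np

rng = np.random.default_rng(0)
n = 10
# Enforce "exactly n/2 ones" by permuting a fixed balanced vector.
draws = np.stack([rng.permutation(np.repeat([0, 1], n // 2)) for _ in range(10_000)])
# Any two positions are anti-correlated; the exact value is -1/(n - 1) ≈ -0.111.
print(np.corrcoef(draws[:, 0], draws[:, 1])[0, 1])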

Here is the idea with a single lag, which MarginalModel could marginalize automatically for you (but not more lags, unfortunately):

import pymc as pm
import pymc_experimental as pmx

with pm.Model() as m:
    # Markov chain over component indices: the transition matrix makes
    # neighboring trials likely to flip components, which softly enforces
    # the global balance. init_dist is added explicitly here to avoid the
    # default-initial-distribution warning.
    init_dist = pm.DiscreteUniform.dist(0, 1)
    idx = pmx.distributions.DiscreteMarkovChain(
        "idx", P=[[0.1, 0.9], [0.9, 0.1]], init_dist=init_dist, shape=(10,)
    )
    emission = pm.Normal("emission", pm.math.where(idx, 1, -1), sigma=0.1, shape=(10,))
    # Baseline: the same mixture with independent component draws per trial.
    iid_emission = pm.NormalMixture("iid_emission", w=[0.5, 0.5], mu=[1, -1], sigma=0.1, shape=(10,))
    
# Std, across draws, of the fraction of positive emissions per draw:
# the Markov-chain version stays much closer to 50/50 than the iid mixture.
print(
    (pm.draw(emission, draws=100) > 0).mean(-1).std(),
    (pm.draw(iid_emission, draws=100) > 0).mean(-1).std(),
)
# 0.063 0.153
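For completeness, here is a minimal sketch of the marginalization step. It assumes the pymc_experimental MarginalModel API (pmx.MarginalModel and its .marginalize method), so treat it as an untested outline rather than a verified recipe:

with pmx.MarginalModel() as marginal_m:
    init_dist = pm.DiscreteUniform.dist(0, 1)
    idx = pmx.distributions.DiscreteMarkovChain(
        "idx", P=[[0.1, 0.9], [0.9, 0.1]], init_dist=init_dist, shape=(10,)
    )
    emission = pm.Normal("emission", pm.math.where(idx, 1, -1), sigma=0.1, shape=(10,))

# Sum idx out of the logp; emission is left as a chain-correlated mixture.
marginal_m.marginalize(["idx"])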