Yes, for the constraint to hold true, there must necessarily be global correlations of some sort. If the first 50% of trials are all 1, the last 50% must all be 0, and of course it shouldn't matter which subset you choose. My idea was just to provide a hacky approximation to these global dependencies by introducing a soft local kernel, which, if implemented over enough n_lags, may not be too terrible.
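To make the global-correlation point concrete, here's a quick numpy illustration (not part of the model, just the counting argument): under a hard exactly-50/50 constraint, any two trials are negatively correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hard global constraint: exactly half of the 10 trials are 1, in random order.
draws = np.stack([rng.permutation(np.repeat([0, 1], 5)) for _ in range(10_000)])
# Knowing one trial is 1 makes any other trial less likely to be 1:
print(np.corrcoef(draws[:, 0], draws[:, 1])[0, 1])  # ~ -1/9 ≈ -0.11
```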
Here is the idea with a single lag, which MarginalModel could marginalize automatically for you (though unfortunately not with more lags); a sketch of the marginalization step follows the snippet:
```python
import pymc as pm
import pymc_experimental as pmx

with pm.Model() as m:
    # Two-state Markov chain that strongly prefers alternating states,
    # softly pushing the sequence towards a 50/50 split.
    idx = pmx.distributions.DiscreteMarkovChain(
        "idx", P=[[0.1, 0.9], [0.9, 0.1]], shape=(10,)
    )
    # Emissions centered at +1/-1 depending on the chain state.
    emission = pm.Normal("emission", pm.math.where(idx, 1, -1), sigma=0.1, shape=(10,))
    # Baseline: iid mixture with the same 50/50 marginal weights but no dependence.
    iid_emission = pm.NormalMixture("iid_emission", w=[0.5, 0.5], mu=[1, -1], sigma=0.1, shape=(10,))

# Std across draws of the fraction of positive emissions: the Markov chain
# concentrates much more tightly around 50% than the iid mixture.
print((pm.draw(emission, draws=100) > 0).mean(-1).std(), (pm.draw(iid_emission, draws=100) > 0).mean(-1).std())
# 0.063 0.153
```
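And a minimal sketch of the marginalization step, assuming the current pymc_experimental MarginalModel supports DiscreteMarkovChain (single lag only); the explicit init_dist is just to be tidy, the snippet above relies on the default:

```python
import pymc as pm
import pymc_experimental as pmx

with pmx.MarginalModel() as marginal_m:
    idx = pmx.distributions.DiscreteMarkovChain(
        "idx",
        P=[[0.1, 0.9], [0.9, 0.1]],
        init_dist=pm.Categorical.dist(p=[0.5, 0.5]),  # explicit initial state distribution
        shape=(10,),
    )
    emission = pm.Normal("emission", pm.math.where(idx, 1, -1), sigma=0.1, shape=(10,))

# Sum the discrete chain out of the logp, so samplers only ever see `emission`.
marginal_m.marginalize(["idx"])
```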