Mixture of bounded normals as a prior


I’m trying to sample from model in which I have a hierarchical prior which is a mixture of 2 2-dimensional bounded normals. I also need to keep the order of the variables and for that reason I use a Deterministic to sort the output. Here is the relevant piece of the model I’m trying to sample from:

model = pm.Model()
with model:

        muCB_ = pm.Uniform('muCB_', lower=0.0, upper=1.0, testval=[0.3, 0.8], shape=2)
        muCB = pm.Deterministic('muCB', tt.sort(muCB_))

        muCB2_ = pm.Uniform('muCB2_', lower=0.0, upper=1.0, testval=[0.3, 0.8], shape=2)
        muCB2 = pm.Deterministic('muCB2', tt.sort(muCB2_))

        sdCB = pm.HalfNormal('sdCB', sd=0.05, shape=2)
        sdCB2 = pm.HalfNormal('sdCB2', sd=0.05, shape=2)

        w = pm.Dirichlet('w', np.ones(2))

        bounded_normal = pm.Bound(pm.Normal, lower=0.0, upper=1.0)
        CB_ = bounded_normal.dist('CB_', mu=muCB, sd=sdCB, testval=np.array([0.3,0.8]), shape=(n,2))            
        CB2_ = bounded_normal.dist('CB2_', mu=muCB2, sd=sdCB2, testval=np.array([0.3,0.8]), shape=(n,2))
        comp_dists = [CB_, CB2_]

        mix = pm.Mixture('mix', w=w, comp_dists = comp_dists)

        CB = pm.Deterministic.dist('CB', tt.sort(mix, axis=1))

        # q is a complicated function of CB[:,0] and CB[:,1]

        qobs = pm.Normal('qobs', mu=q, sd=sigmaq, observed=qobs, shape=n)

However, I’m getting the following error, that looks like a shape error somewhere. Anyone has a hint?

TypeError                                 Traceback (most recent call last)
/scratch/Dropbox/GIT/axial_ratio/axial_mixture.py in <module>()
180     out.read_obs('data/small_plus_uncer.dat')
--> 181     out.sample('data/small_plus_uncer.samples', noncentered=False)
183     f, ax = pl.subplots(nrows=2, ncols=2, figsize=(8,8))

/scratch/Dropbox/GIT/axial_ratio/axial_mixture.py in sample(self, name_chain, noncentered)
135                 comp_dists = [CB_, CB2_]
--> 137                 mix = pm.Mixture('mix', w=w, comp_dists = comp_dists)
138                 # bounded_normal = pm.Bound(pm.Normal, lower=0.0, upper=1.0)

/scratch/Dropbox/GIT/pymc3/pymc3/distributions/distribution.py in __new__(cls, name, *args, **kwargs)
 35             total_size = kwargs.pop('total_size', None)
 36             dist = cls.dist(*args, **kwargs)
---> 37             return model.Var(name, dist, data, total_size)
 38         else:
 39             raise TypeError("Name needs to be a string but got: {}".format(name))

/scratch/Dropbox/GIT/pymc3/pymc3/model.py in Var(self, name, dist, data, total_size)
800                 with self:
801                     var = FreeRV(name=name, distribution=dist,
--> 802                                  total_size=total_size, model=self)
803                 self.free_RVs.append(var)
804             else:

/scratch/Dropbox/GIT/pymc3/pymc3/model.py in __init__(self, type, owner, index, name, distribution,           
total_size, model)
1179             self.tag.test_value = np.ones(
1180                 distribution.shape, distribution.dtype) * distribution.default()
-> 1181             self.logp_elemwiset = distribution.logp(self)
1182             # The logp might need scaling in minibatches.
1183             # This is done in `Factor`.

/scratch/Dropbox/GIT/pymc3/pymc3/distributions/mixture.py in logp(self, value)
143         w = self.w
--> 145         return bound(logsumexp(tt.log(w) + self._comp_logp(value), axis=-1),
146                      w >= 0, w <= 1, tt.allclose(w.sum(axis=-1), 1),
147                      broadcast_conditions=False)

/scratch/Dropbox/GIT/pymc3/pymc3/distributions/mixture.py in _comp_logp(self, value)
108         try:
--> 109             value_ = value if value.ndim > 1 else tt.shape_padright(value)
111             return comp_dists.logp(value_)

/scratch/miniconda3/envs/py36/lib/python3.6/site-packages/theano/tensor/basic.py in shape_padright(t, 
4445     pattern = [i for i in xrange(_t.type.ndim)] + ['x'] * n_ones
-> 4446     return DimShuffle(_t.broadcastable, pattern)(_t)

/scratch/miniconda3/envs/py36/lib/python3.6/site-packages/theano/gof/op.py in __call__(self, *inputs, 
623             for i, ins in enumerate(node.inputs):
624                 try:
--> 625                     storage_map[ins] = [self._get_test_value(ins)]
626                     compute_map[ins] = [True]
627                 except AttributeError:

/scratch/miniconda3/envs/py36/lib/python3.6/site-packages/theano/gof/op.py in _get_test_value(cls, v)
560             # ensure that the test value is correct
561             try:
--> 562                 ret = v.type.filter(v.tag.test_value)
563             except Exception as e:
564                 # Better error message.

/scratch/miniconda3/envs/py36/lib/python3.6/site-packages/theano/tensor/type.py in filter(self, data, strict, 
176             raise TypeError("Wrong number of dimensions: expected %s,"
177                             " got %s with shape %s." % (self.ndim, data.ndim,
--> 178                                                         data.shape))
179         if not data.flags.aligned:
180             try:

TypeError: For compute_test_value, one input test value does not have the requested type.

The error when converting the test value to that variable type:
Wrong number of dimensions: expected 0, got 1 with shape (2,).

Specifying the shape in the mixture should fix the error:
mix = pm.Mixture('mix', w=w, comp_dists = comp_dists, shape=(n, 2))

However, a few point you should fix:
1, using tt.sort does not impose ordering from the sampler standpoint, as the logp is not modified with tt.sort. You should use an Ordered transformation (still WIP PR on Github, but you can search the discourse for some more informaiton)
2, CB = pm.Deterministic.dist('CB', tt.sort(mix, axis=1)) is not allow - I guess you meant CB = pm.Deterministic('CB', tt.sort(mix, axis=1)) However, sorting the mixture is pretty weird choice - i would suggest removing it.

Thanks for the suggestion to fix the dimensions error. Concerning your other points:

  1. So you mean that if I use tt.sort I don’t get the Jacobian properly multiplied with the posterior, right? I’m apparently getting nice results from my original model (without mixtures) but I will need to test if there is a problem.
  2. This one was my mistake. I pasted the wrong code.

Yep that’s what I meant. You sometimes still get OK result with sort because the trace is basically staying in one mode, but in general it is not a valid solution.

Thanks. I have just read you answer in https://discourse.pymc.io/t/mixture-models-and-breaking-class-symmetry/208 and also read it in the Stan’s manual. It looks, though, that the correction is probably small but I have to introduce it.