Sampling uniformly in a triangular support

Background (for entertainment purposes):

I want to encode the following (prior) knowledge as a probability (density):

  1. x[0], x[1] belong to the interval [xl, xu]
  2. x[0] < x[1]

What I was doing in emcee was simply rejecting the proposals that fail condition 2.
When I went on to code this in pymc3, I used a “smart” :man_facepalming: trick that had worked for a related problem to code a generative model. The trick is to rescale the draws of x[0] into the [x[1],xu] range, like so:

import pymc3 as pm
import theano.tensor as tt

# model params:
N = 2
xl = 0
xu = 1

triangle = pm.Model()
with triangle:
    x_naive = pm.Uniform('x_naive', lower=xl, upper=xu, shape=N)
    # rescale x_naive[0] so that its range depends on x_naive[1]
    x_without_a_name = tt.set_subtensor(
        x_naive[0], (x_naive[0] - xl) / (xu - xl) * (xu - x_naive[1]) + xl)
    x = pm.Deterministic('x', x_without_a_name)  # just to attach a name

with triangle:
    trace = pm.sample(draws=100000, njobs=1)

There is a “tiny” problem though: the density thus produced is not proper and does not really convey the intention :wink:.
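A quick way to see this numerically (a numpy sketch of my own, assuming xl = 0 and xu = 1 as in the snippet above): forward-simulate the trick and histogram the result. By the change of variables, the joint density of the transformed pair is 1/(1 - x[1]), not a constant.

import numpy as np

rng = np.random.default_rng(0)
u0, u1 = rng.uniform(size=(2, 100000))
x0 = u0 * (1 - u1)   # the rescaling trick with xl = 0, xu = 1
x1 = u1

# If the density were uniform, the occupied cells of a 2D histogram
# would have roughly equal counts; instead they grow as x1 approaches 1.
H, _, _ = np.histogram2d(x0, x1, bins=20, range=[[0, 1], [0, 1]])
print(H)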

The question

How would one proceed to code a uniform prior with boundaries of the form x[i] < x[j] for use with NUTS?

There are a couple of ways to do it:

  1. You can use the ordered transformation here.
  2. You can add a Jacobian correction for the volume change in your case above as well, to make it proper.
  3. You can jointly sample from a bivariate standard Gaussian and then rotate it by 45 degrees; the rotated sample is transformed into rates that lie in the unit square. See here.
  4. You can add a potential to restrict the order; see cell [4] here for an example, and the sketch just below.
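To illustrate option 4, here is a minimal sketch of my own (a paraphrase, not the linked notebook): a pm.Potential that adds -inf to the log-probability whenever the ordering is violated. A hard cutoff like this can be rough on NUTS, which is why a smooth penalty (e.g. a steep sigmoid of x[1] - x[0]) is sometimes substituted, as in the follow-up below.

import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as m:
    # start at a point that satisfies the constraint
    x = pm.Uniform('x', lower=0., upper=1., shape=2, testval=[0.25, 0.75])
    # zero penalty when the order holds, -inf otherwise (rejects the draw)
    pm.Potential('order', tt.switch(x[0] < x[1], 0., -np.inf))
    trace = pm.sample(5000)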

Thanks, @junpenglao, got my stuff to work (so far so good) with Potential, using a sigmoid instead of the switch. The sampling rate has dropped approximately 8-fold, which is OK.
I don’t quite understand how to use the Ordered transform though (I wanted to check whether it is faster). I tried

import pymc3 as pm
import theano.tensor as tt

# model params:
N = 2
xl = 0
xu = 1

class Ordered(pm.distributions.transforms.ElemwiseTransform):
    name = "ordered"

    def forward(self, x):
        # ordered vector -> unconstrained: keep x[0], log the increments
        out = tt.zeros(x.shape)
        out = tt.inc_subtensor(out[0], x[0])
        out = tt.inc_subtensor(out[1:], tt.log(x[1:] - x[:-1]))
        return out

    def forward_val(self, x, point=None):
        x, = pm.distributions.distribution.draw_values([x], point=point)
        return self.forward(x)

    def backward(self, y):
        # inverse: exponentiate the increments and cumulatively sum
        out = tt.zeros(y.shape)
        out = tt.inc_subtensor(out[0], y[0])
        out = tt.inc_subtensor(out[1:], tt.exp(y[1:]))
        return tt.cumsum(out)

    def jacobian_det(self, y):
        return tt.sum(y[1:])


triangle = pm.Model()
with triangle:
    """
    This is to test the assertion that the rescaling approach to creating
    generative models with linear constraints produces improper densities.
    Knowledge:
    1) x[0], x[1] \in [xl, xu]
    2) x[0] < x[1]
    """
    x_naive = pm.Uniform('x',
                         lower=xl,
                         upper=xu,
                         shape=N,
                         transform=Ordered())

with triangle:
    trace = pm.sample(draws=5000, njobs=1)

with various combinations of the shape argument of Uniform (N, (N), (N,1), (1,N)), and tried increasing N, all giving this error:

Traceback (most recent call last):
  File "/mnt/emptyplaceholder/tmp/pycharm-2017.3.2/helpers/pydev/pydev_run_in_console.py", line 53, in run_file
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/mnt/emptyplaceholder/tmp/pycharm-2017.3.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/mnt/emptyplaceholder/mcmc/Experiments/20180119_1_triangle/sample_ordered.py", line 55, in <module>
    transform=Ordered())
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/pymc3/distributions/distribution.py", line 37, in __new__
    return model.Var(name, dist, data, total_size)
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/pymc3/model.py", line 759, in Var
    model=self)
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/pymc3/model.py", line 1393, in __init__
    transformed_name, transform.apply(distribution), total_size=total_size)
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/pymc3/distributions/transforms.py", line 38, in apply
    return TransformedDistribution.dist(dist, self)
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/pymc3/distributions/distribution.py", line 47, in dist
    dist.__init__(*args, **kwargs)
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/pymc3/distributions/transforms.py", line 63, in __init__
    testval = forward(dist.default())
  File "/mnt/emptyplaceholder/mcmc/Experiments/20180119_1_triangle/sample_ordered.py", line 24, in forward
    out = tt.inc_subtensor(out[0], x[0])
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/theano/tensor/var.py", line 579, in __getitem__
    lambda entry: isinstance(entry, Variable)))
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/theano/gof/op.py", line 615, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/mnt/emptyplaceholder/miniconda/lib/python3.6/site-packages/theano/tensor/subtensor.py", line 481, in make_node
    raise exception
ValueError: The index list is longer (size 1) than the number of dimensions of the tensor(namely 0). You are asking for a dimension of the tensor that does not exist! You might need to use dimshuffle to add extra dimension to your tensor.

When I stopped in the debugger on the line out = tt.inc_subtensor(out[0], x[0]) that was raising the error and evaluated x, it returned array(0.5) followed by an error, like so:

In[2]: x
Out[2]: array(0.5)
Error evaluating: thread_id: pid_10211_id_140233461333352
frame_id: 140232644536784
scope: FRAME
attrs: x
Traceback (most recent call last):
  File "/mnt/emptyplaceholder/tmp/pycharm-2017.3.2/helpers/pydev/_pydevd_bundle/pydevd_vars.py", line 239, in resolve_compound_variable
    return _typeName, resolver.get_dictionary(var)
  File "/mnt/emptyplaceholder/tmp/pycharm-2017.3.2/helpers/pydev/pydevd_plugins/extensions/types/pydevd_plugin_numpy_types.py", line 77, in get_dictionary
    ret['[0:%s] ' % (len(obj))] = list(obj[0:MAX_ITEMS_TO_HANDLE])
IndexError: too many indices for array

When I tried x.shape:

x.shape
Out[3]: ()
Error evaluating: thread_id: pid_10211_id_140233461333352
frame_id: 140232644536784
scope: FRAME
attrs: x
Traceback (most recent call last):
  File "/mnt/emptyplaceholder/tmp/pycharm-2017.3.2/helpers/pydev/_pydevd_bundle/pydevd_vars.py", line 239, in resolve_compound_variable
    return _typeName, resolver.get_dictionary(var)
  File "/mnt/emptyplaceholder/tmp/pycharm-2017.3.2/helpers/pydev/pydevd_plugins/extensions/types/pydevd_plugin_numpy_types.py", line 77, in get_dictionary
    ret['[0:%s] ' % (len(obj))] = list(obj[0:MAX_ITEMS_TO_HANDLE])
IndexError: too many indices for array
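So, apparently, forward was being called with the distribution’s default test value, a 0-d array(0.5) rather than a length-N vector, and a 0-d tensor cannot be indexed with [0], which matches the ValueError above.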

OK, got the toy example to work by composing the Interval transform with Ordered and supplying an explicit vector testval:

import matplotlib.pyplot as plt
import pymc3 as pm
import theano.tensor as tt

# model params:
N = 2
xl = 0
xu = 1

class Ordered(pm.distributions.transforms.ElemwiseTransform):
    name = "ordered"

    def forward(self, x):
        out = tt.zeros(x.shape)
        out = tt.inc_subtensor(out[0], x[0])
        out = tt.inc_subtensor(out[1:], tt.log(x[1:] - x[:-1]))
        return out

    def forward_val(self, x, point=None):
        x, = pm.distributions.distribution.draw_values([x], point=point)
        return self.forward(x)

    def backward(self, y):
        out = tt.zeros(y.shape)
        out = tt.inc_subtensor(out[0], y[0])
        out = tt.inc_subtensor(out[1:], tt.exp(y[1:]))
        return tt.cumsum(out)

    def jacobian_det(self, y):
        return tt.sum(y[1:])


class Composed(pm.distributions.transforms.Transform):
    def __init__(self, trafo1, trafo2):
        self._trafo1 = trafo1
        self._trafo2 = trafo2
        self.name = '_'.join([trafo1.name, trafo2.name])

    def forward(self, x):
        return self._trafo2.forward(self._trafo1.forward(x))

    def forward_val(self, x, point=None):
        return self.forward(x)

    def backward(self, y):
        return self._trafo1.backward(self._trafo2.backward(y))

    def jacobian_det(self, y):
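        # The log-Jacobians add because, by the chain rule, the determinant
        # of a composed map is the product of the individual determinants.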
        y2 = self._trafo2.backward(y)
        det1 = self._trafo1.jacobian_det(y2)
        det2 = self._trafo2.jacobian_det(y)
        return det1 + det2



triangle = pm.Model()
with triangle:
    """
    This is to test the assertion that the rescaling approach to creating
    generative models with linear constraints produces improper densities.
    Knowledge:
    1) x[0], x[1] \in [xl, xu]
    2) x[0] < x[1]
    """
    the_transform = Composed(pm.distributions.transforms.Interval(xl, xu), Ordered())
    x_naive = pm.Uniform('x',
                         lower=xl,
                         upper=xu,
                         shape=N,
                         transform=the_transform,
                         testval=[0.1, 0.9])  # explicit, already-ordered vector testval

with triangle:
    trace = pm.sample(draws=5000, njobs=1)

pm.traceplot(trace)
plt.show()

y = trace['x']
fig, ax = plt.subplots(figsize=(20, 15))
plt.plot(y[:, 0], y[:, 1], 'k.')
plt.savefig('triangle.png')
plt.show()

The resulting density is not what was expected though.

Probably you were right, in a different post, about the Jacobian being incorrect.

Thanks for the update. Interesting, I didn’t realize that using Ordered actually introduces non-uniform draws (they might be uniform in another space); back to the drawing board…

I was verifying the maths and it all seems correct, but I’m a bit concerned that forward_val in the Composed class ignores the point argument.
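For reference, the by-hand calculation being verified (my notation): the backward map is x_0 = y_0 and x_i = x_{i-1} + e^{y_i} for i >= 1, so its Jacobian \partial x / \partial y is lower triangular with diagonal (1, e^{y_1}, ..., e^{y_{N-1}}), and hence

log |det J| = \sum_{i >= 1} y_i,

which is exactly tt.sum(y[1:]).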

@junpenglao, check this out! I just discovered that ElemwiseTransform implements the (log) Jacobian determinant generically, but we were overriding it with our hack. When I disable the override, everything works nicely!!
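For context, the generic version (paraphrasing pymc3/distributions/transforms.py from memory, so treat this as approximate) autodiffs the backward map and returns an element-wise log-Jacobian with the same shape as its input, in contrast to our scalar tt.sum(y[1:]):

# roughly what ElemwiseTransform.jacobian_det does, where gradient
# is pymc3.theanof.gradient (approximate paraphrase, not the exact source)
def jacobian_det(self, x):
    grad = tt.reshape(gradient(tt.sum(self.backward(x)), [x]), x.shape)
    return tt.log(tt.abs_(grad))  # one entry per element of x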


class Ordered(pm.distributions.transforms.ElemwiseTransform):
    name = "ordered"

    def forward(self, x):
        out = tt.zeros(x.shape)
        out = tt.inc_subtensor(out[0], x[0])
        out = tt.inc_subtensor(out[1:], tt.log(x[1:] - x[:-1]))
        return out

    def forward_val(self, x, point=None):
        x, = pm.distributions.distribution.draw_values([x], point=point)
        return self.forward(x)

    def backward(self, y):
        out = tt.zeros(y.shape)
        out = tt.inc_subtensor(out[0], y[0])
        out = tt.inc_subtensor(out[1:], tt.exp(y[1:]))
        return tt.cumsum(out)

    # def jacobian_det(self, y):
    #     return tt.sum(y[1:])


class Composed(pm.distributions.transforms.Transform):
    def __init__(self, trafo1, trafo2):
        self._trafo1 = trafo1
        self._trafo2 = trafo2
        self.name = '_'.join([trafo1.name, trafo2.name])

    def forward(self, x):
        return self._trafo2.forward(self._trafo1.forward(x))

    def forward_val(self, x, point=None):
        return self.forward(x)

    def backward(self, y):
        return self._trafo1.backward(self._trafo2.backward(y))

    def jacobian_det(self, y):
        y2 = self._trafo2.backward(y)
        det1 = self._trafo1.jacobian_det(y2)
        det2 = self._trafo2.jacobian_det(y)
        return det1 + det2



triangle = pm.Model()
with triangle:
    """
      Knowledge: 
    1) x[0],x[1] \in [xl,xu]
    2) x[0] < x[2]
    """
    the_transform = Composed(pm.distributions.transforms.Interval(xl,xu),Ordered())
    x_naive = pm.Uniform('x',
                         lower = xl,
                         upper = xu,
                         shape = N,
                         transform=the_transform,
                         testval=[0.4,0.6])

with triangle:
    trace = pm.sample(draws = 5000,njobs = 1)

pm.traceplot(trace)
plt.show()

y = trace['x']
fig, ax = plt.subplots(figsize=(20, 15))
plt.plot(y[:, 0], y[:, 1], 'k.')
plt.savefig('triangle.png')
plt.show()

pass


Will report on speed compared to the Potential approach in a real application in a couple of hours.


Thanks a lot! This is really helpful for our implementation in the WIP PR!!!


:)) Glad to help. I’d like to make a couple of comments while we’re at it:

  1. It’s strange that the simple by-hand calculation (the log-det of a triangular matrix of exponentials) turns out to be incorrect.
  2. Again, it looks strange (to an outsider) that forward_val in the Composed class ignores the point argument.
  3. It would be nice to have the identity transform implemented explicitly in pm.distributions.transforms; then things like Composed would also work when someone wants to test them with transform=None.

  1. I agree, we will take a closer look at that Composed transform. I think there are opportunities there to implement something closer to bijector.chain in tensorflow distributions.
  2. Yep, I think it might be better to do self._trafo2.forward_val(self._trafo1.forward_val(x, point=point), point=point).
  3. That’s a great idea.

This is really quite puzzling; the computation all seems correct, and if I construct the model by hand, it all works fine…

class Ordered(pm.distributions.transforms.ElemwiseTransform):
    name = "ordered"

    def forward(self, y):
        x = tt.zeros(y.shape)
        x = tt.inc_subtensor(x[0], y[0])
        x = tt.inc_subtensor(x[1:], tt.log(y[1:] - y[:-1]))
        return x

    def forward_val(self, x, point=None):
        x, = pm.distributions.distribution.draw_values([x], point=point)
        return self.forward(x)

    def backward(self, x):
        y = tt.zeros(x.shape)
        y = tt.inc_subtensor(y[0], x[0])
        y = tt.inc_subtensor(y[1:], tt.exp(x[1:]))
        return tt.cumsum(y)

    def jacobian_det(self, y):
        return tt.sum(y[1:])

with pm.Model() as mtest:
    x = pm.Flat('x_real', shape=2, testval=[2.,1.])
    Order = Ordered()
    pm.Potential('y1', Order.jacobian_det(x))
    
    x1 = Order.backward(x)
    Interval = pm.distributions.transforms.Interval(0., 1.)
    x2 = pm.Deterministic('x_interval_ordered', Interval.backward(x1))
    pm.Potential('y2', pm.Uniform.dist(0, 1).logp(x2)+Interval.jacobian_det(x1))
    
    trtest = pm.sample(5000, init='adapt_diag')

_, ax = plt.subplots(1, 2, figsize=(10, 5))
for ivar, varname in enumerate(trtest.varnames):
    ax[ivar].scatter(trtest[varname][:, 0], trtest[varname][:, 1], alpha=.01)
    ax[ivar].set_title(varname)
plt.tight_layout();

@junpenglao
Interesting. This is what I just did to circumvent the Interval transform (note that I’ve put back our version of the Jacobian det):

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
import theano.tensor as tt


#model params:
N = 2
xl = 0
xu = 1


class Ordered(pm.distributions.transforms.ElemwiseTransform):
    name = "ordered"

    def forward(self, x):
        out = tt.zeros(x.shape)
        out = tt.inc_subtensor(out[0], x[0])
        out = tt.inc_subtensor(out[1:], tt.log(x[1:] - x[:-1]))
        return out

    def forward_val(self, x, point=None):
        x, = pm.distributions.distribution.draw_values([x], point=point)
        return self.forward(x)

    def backward(self, y):
        out = tt.zeros(y.shape)
        out = tt.inc_subtensor(out[0], y[0])
        out = tt.inc_subtensor(out[1:], tt.exp(y[1:]))
        return tt.cumsum(out)

    def jacobian_det(self, y):
        return tt.sum(y[1:])

class Identity(pm.distributions.transforms.ElemwiseTransform):
    name = "identity"

    def forward(self, x):
        return x

    def forward_val(self, x, point=None):
        x, = pm.distributions.distribution.draw_values([x], point=point)
        return self.forward(x)

    def backward(self, y):
        return y

    def jacobian_det(self, y):
        return tt.as_tensor_variable(0.) #This is ln|J|

class Composed(pm.distributions.transforms.Transform):
    def __init__(self, trafo1, trafo2):
        self._trafo1 = trafo1
        self._trafo2 = trafo2
        self.name = '_'.join([trafo1.name, trafo2.name])

    def forward(self, x):
        return self._trafo2.forward(self._trafo1.forward(x))

    def forward_val(self, x, point=None):
        return self.forward(x)

    def backward(self, z):
        return self._trafo1.backward(self._trafo2.backward(z))

    def jacobian_det(self, z):
        y = self._trafo2.backward(z)
        det1 = self._trafo1.jacobian_det(y)
        det2 = self._trafo2.jacobian_det(z)
        return det1 + det2



triangle = pm.Model()
with triangle:
    """
    This is to test the assertion that rescaling approach to create generative models with linear constraints produces 
    improper densities
    Knowledge: 
    1) x[0],x[1] \in [xl,xu]
    2) x[0] < x[2]
    """
    #the_transform = Composed(pm.distributions.transforms.Interval(xl,xu),Ordered())
    #the_transform = Composed(pm.distributions.transforms.Interval(xl,xu),Identity())
    #the_transform = pm.distributions.transforms.Interval(xl,xu)
    
    the_transform = Composed(Identity(),Ordered())
    x_naive = pm.Uniform('x',
                         lower=xl,
                         upper=xu,
                         shape=N,
                         transform=the_transform,
                         testval=[0.4, 0.6])

with triangle:
    trace = pm.sample(draws=30000, chains=1, njobs=1)

pm.traceplot(trace)
plt.show()

y = trace['x']
fig, ax = plt.subplots(figsize=(20, 15))
plt.plot(y[:, 0], y[:, 1], 'k.')
plt.savefig('triangle.png')
plt.show()

pass

And got this:

[scatter of trace['x'], saved as triangle.png by the script above]

With the jacobian_det from ElemwiseTransform, the density is uniform, as it should be.


Yeah, I experimented with it a bit further yesterday, and the Jacobian and backward/forward all seem correct. So maybe something is being computed incorrectly when the Composed transformed RV is created internally.


I finally figured out what went wrong in the Composed implementation, and it is a bit more difficult than I thought.
The problem is that, in the composed jacobian_det, you can combine an element-wise operation with a vector operation; however, the jacobian_det of the vector operation has one dimension less, and it gets broadcast against the element-wise jacobian_det.
Unfortunately, fixing it is a bit more difficult, as we need to figure out a robust way to collapse the dimension (the one along which the ordering is performed). Still thinking of a good way to do so.
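To make the shape issue concrete, a small sketch of just the shapes involved (numpy stand-ins of my own, not the actual pymc3 internals):

import numpy as np

N = 3
# Interval is element-wise: its log-Jacobian has one entry per element.
det_interval = np.zeros(N)            # shape (N,)
# Ordered acts on the whole vector: its log-Jacobian is a single scalar.
det_ordered = np.float64(0.0)         # shape ()
# Composed adds them; broadcasting silently expands the scalar to shape (N,),
# so when the model later sums this, the ordered term is counted N times.
det_composed = det_interval + det_ordered   # shape (N,), wrong total

This would also explain why the by-hand model above behaves: each pm.Potential contributes a term that is summed directly into the scalar joint log-probability, so no broadcasting between mismatched jacobian_det shapes ever occurs.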


Congrats!!