Defining a custom multivariate Distribution with gradients

Hello,

I am trying to sample from a custom distribution whose derivatives I also compute myself.
So far I learned that I have to define my own Theano Op, but I have problems matching the types.

    import pymc3 as pm
    import theano
    import numpy as np

    class myOp(theano.gof.Op):
        itypes=[theano.tensor.dvector]
        otypes=[theano.tensor.dscalar]

        #def make_node(self, v):
            #v = theano.tensor.as_tensor_variable(v)
            #return theano.Apply(self, [v], [v.type()])

        def __init__(self):
            self.logp = lambda x: x
            self.dlogp = lambda x: x

        def perform(self, node, inputs, outputs):
            # outputs holds one storage cell per output: assign into outputs[0][0]
            outputs[0][0] = self.logp(inputs[0])

        def grad(self, inputs, g):
            # chain rule: scale the incoming gradient by dlogp
            return [g[0] * self.dlogp(inputs[0])]

    class myDist(pm.distributions.Continuous):
        def __init__(self, *args, **kwargs):
            self.op = myOp()
            super(myDist, self).__init__(*args, **kwargs)
        def logp(self, x):
            return self.op(x)

    with pm.Model() as model:
        mydist = myDist('m', testval=np.random.rand(3))

This returns

    TypeError: We expected inputs of types '[TensorType(float64, vector)]' but got types '[TensorType(float64, scalar)]'

Right now logp and dlogp are just the identity, but I want to replace them with custom code. Do I really need to define a second Op for the derivative, as suggested in this post?

The easiest approach is to rewrite the likelihood as a theano expression, in which case the gradient is derived automatically. Otherwise, yes, you need to define a second Op for the derivative; for more see http://docs.pymc.io/advanced_theano.html.
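Whichever way you wire up the derivative Op, it is worth checking the hand-coded gradient against central finite differences in plain numpy first, independent of theano. A minimal sketch — the quadratic logp below is a hypothetical stand-in for your external solver:

```python
import numpy as np

# hypothetical stand-in for the external solver's log-density
def logp(x):
    return -0.5 * np.sum(x ** 2)

# hand-coded gradient of logp
def dlogp(x):
    return -x

def check_grad(f, df, x, eps=1e-6):
    """Max abs difference between analytic gradient and central finite differences."""
    num = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        num[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return np.max(np.abs(num - df(x)))

x = np.array([0.3, -1.2, 2.0])
err = check_grad(logp, dlogp, x)  # tiny if dlogp really is the gradient of logp
```

If the check passes, the same dlogp can be wrapped in the second Op and returned from grad.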

The likelihood is based on an external solver, so the pure theano way is not possible. I will go for the second Op then.

But do you have any advice on how to fix the type-issues (accepting a vector)?

I now realized that logp(self, x) is being passed the node's name as the value x:

    import pymc3 as pm

    class myDist(pm.Continuous):
        def __init__(self, **kwargs):
            super(myDist, self).__init__(**kwargs)
        def logp(self, x):
            print('value:', x)

    with pm.Model() as model:
        myDist(name='likelihood', testval=0)

returns

    value: likelihood
    [...]
    AsTensorError: ('Cannot convert None to TensorType', <class 'NoneType'>)

In the end logp represents a posterior (up to a normalizing constant) that I want to sample from, but the data/priors will be hard-coded in the myOp definitions. So I just want to sample from the multivariate distribution myDist (I guess it's an unobserved variable in this context).

Do I have to define/instantiate it differently for this to work?
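To illustrate the "up to a normalizing constant" point: samplers only use differences (or ratios) of log-densities, and any additive constant cancels there, so the external solver never needs to compute it. A numpy-only toy example with a hypothetical standard-normal target:

```python
import numpy as np

# hypothetical unnormalized log-density: standard normal without its constant
def logp_unnorm(x):
    return -0.5 * np.dot(x, x)

# fully normalized version, for comparison
def logp_norm(x):
    d = x.size
    return -0.5 * np.dot(x, x) - 0.5 * d * np.log(2 * np.pi)

x = np.array([0.5, -0.3, 1.0])
y = np.array([1.5, 0.2, -0.7])

# the log-density difference (what e.g. a Metropolis acceptance ratio uses)
# is identical with or without the normalizing constant
diff_unnorm = logp_unnorm(x) - logp_unnorm(y)
diff_norm = logp_norm(x) - logp_norm(y)
```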

So I had many problems in my old code, but I'm halfway there now.

If there's any interest, I'll keep you updated.


It would be great if you could keep us updated. Very interesting topic.