Defining custom multivariate Distribution with gradients

Hello,

I am trying to sample from a custom distribution whose derivatives I also compute myself.
So far I learned that I have to define my own Theano Op, but I have problems matching the types.

    import pymc3 as pm
    import theano
    import numpy as np

    class myOp(theano.gof.Op):
        itypes=[theano.tensor.dvector]
        otypes=[theano.tensor.dscalar]

        #def make_node(self, v):
            #v = theano.tensor.as_tensor_variable(v)
            #return theano.Apply(self, [v], [v.type()])

        def __init__(self):
            self.logp = lambda x: x
            self.dlogp = lambda x: x

        def perform(self, node, inputs, outputs):
            # output storage cells are one-element lists: write the result
            # into outputs[0][0], not outputs[0]
            outputs[0][0] = np.asarray(self.logp(inputs[0]))

        def grad(self, inputs, g):
            # note: inputs here are symbolic, so self.dlogp must build a
            # Theano expression (or be wrapped in its own Op)
            return [g[0] * self.dlogp(inputs[0])]

    class myDist(pm.distributions.Continuous):
        def __init__(self, *args, **kwargs):
            self.op = myOp()
            super(myDist, self).__init__(*args, **kwargs)
        def logp(self, x):
            return self.op(x)

    with pm.Model() as model:
        mydist = myDist('m', testval=np.random.rand(3))

This returns

    TypeError: We expected inputs of types '[TensorType(float64, vector)]' but got types '[TensorType(float64, scalar)]'

Right now logp and dlogp are just the identity, but I want to replace them with custom code. Do I really need to define a second Op for the derivative, as suggested in this post?

The easiest approach is to rewrite the likelihood as a Theano function; then the gradient is defined automatically. Otherwise, yes, you need to define an Op for the derivative. For more, see: http://docs.pymc.io/advanced_theano.html.

The likelihood is based on an external solver, so the pure theano way is not possible. I will go for the second Op then.

But do you have any advice on how to fix the type issues (accepting a vector)?

I now realized that logp(self, x) is being passed the node's name as the value x:

    import pymc3 as pm

    class myDist(pm.Continuous):
        def __init__(self, **kwargs):
            super(myDist, self).__init__(**kwargs)
        def logp(self, x):
            print('value:', x)

    with pm.Model() as model:
        myDist(name='likelihood', testval=0)

returns

value: likelihood
[...]
AsTensorError: ('Cannot convert None to TensorType', <class 'NoneType'>)

In the end, logp represents a posterior (up to a normalizing constant) that I want to sample from, but the data/priors will be hardcoded in the myOp definition. So I just want to sample from the multivariate distribution myDist (I guess it's an unobserved variable in this context).

Do I have to define/instantiate it differently for this to work?

So I had many problems in my old code, but I’m halfway done now.

If there's any interest, I'll keep you updated.

It would be great if you could keep us updated. Very interesting topic.