How to make a TensorFlow ML model compatible with PyMC3?

You will need to use some code involving tf.gradients / tf.GradientTape (see more about these here). The blog post I linked to above uses tf.gradients, so I think your surest bet would be to mimic that code. The idea here is that your ML model is just some function $F: \Theta \to \mathbb{R}^d$. Your grad method needs to compute $\frac{\partial F}{\partial \theta}$, which is exactly what tf.gradients gives you.
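
To build intuition, here is a minimal sketch (my own toy example, not from the post) of pulling $\frac{\partial F}{\partial \theta}$ out of TensorFlow with tf.GradientTape. The model f below is a hypothetical stand-in for your network, and this uses the TF 2 / eager API rather than the TF1-style tf.gradients the post relies on:

import tensorflow as tf

# Hypothetical stand-in for your ML model: any differentiable F: Theta -> R^d
def f(theta):
    return tf.stack([tf.sin(theta), tf.cos(theta), theta ** 2])

theta = tf.Variable(0.5, dtype=tf.float64)
with tf.GradientTape() as tape:
    y = f(theta)

# dF/dtheta; since theta is a scalar and y has shape (3,), this is a length-3 vector
dF_dtheta = tape.jacobian(y, theta)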

Maybe this will make the grad Op a little less daunting. Here’s a bare-bones example of a grad Op that I’ve used before:

import numpy as np
import theano.tensor as tt

class GradOp(tt.Op):

    itypes = [tt.dscalar, tt.dvector]  # first input is theta, second is the gradient seed
    otypes = [tt.dscalar]

    def __init__(self, nn_model):
        self.nn_model = nn_model

    def perform(self, node, inputs, outputs):
        theta, g = inputs
        # dF/dtheta at the current theta, as a plain float64 array
        result = np.float64(np.squeeze(self.nn_model.derivative(theta)))
        # Contract with the seed g to get the vector-Jacobian product
        outputs[0][0] = np.dot(result, g)
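
For context, here's a hedged sketch of how a GradOp like this gets wired into the forward Op's grad method. ModelOp and the nn_model.predict / nn_model.derivative interface are hypothetical stand-ins for whatever your model actually exposes:

class ModelOp(tt.Op):

    itypes = [tt.dscalar]  # theta
    otypes = [tt.dvector]  # F(theta) in R^d

    def __init__(self, nn_model):
        self.nn_model = nn_model
        self.grad_op = GradOp(nn_model)

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        # Hypothetical forward pass of your model
        outputs[0][0] = np.asarray(self.nn_model.predict(theta), dtype=np.float64)

    def grad(self, inputs, output_grads):
        (theta,) = inputs
        # Theano hands us the seed for the output; GradOp contracts it
        # with dF/dtheta to give the gradient with respect to theta
        return [self.grad_op(theta, output_grads[0])]

The key point is that grad doesn't return the Jacobian itself; it returns a symbolic expression (here, a call to GradOp) that contracts the incoming seed with the Jacobian.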

The difference between this and the grad Op in that post is that, in reality, getting the derivative out of TensorFlow isn't as simple as calling something like nn_model.derivative(). The grad Op from the post is:

class _TensorFlowGradOp(tt.Op):
    """A custom Theano Op defining the gradient of a TensorFlowOp
    
    Args:
        base_op (TensorFlowOp): The original Op
    
    """
    def __init__(self, base_op):
        self.base_op = base_op
        
        # Build the TensorFlow operation to apply the reverse mode
        # autodiff for this operation
        # The placeholder is used to include the gradient of the
        # output as a seed
        self.dy = tf.placeholder(tf.float64, base_op.output_shape)
        self.grad_target = tf.gradients(base_op.target,
                                        base_op.parameters,
                                        grad_ys=self.dy)

        # This operation will take the original inputs and the gradient
        # seed as input
        types = [_to_tensor_type(shape) for shape in base_op.shapes]
        self.itypes = tuple(types + [_to_tensor_type(base_op.output_shape)])
        self.otypes = tuple(types)
 
    def infer_shape(self, node, shapes):
        return self.base_op.shapes

    def perform(self, node, inputs, outputs):
        feed_dict = dict(zip(self.base_op.parameters, inputs[:-1]),
                         **self.base_op._feed_dict)
        feed_dict[self.dy] = inputs[-1]
        result = self.base_op.session.run(self.grad_target, feed_dict=feed_dict)
        for i, r in enumerate(result):
            outputs[i][0] = np.array(r)

You can think of the extra code in the __init__ and perform methods as the overhead needed to actually fetch the gradients from TensorFlow: __init__ builds the tf.gradients node once, with the dy placeholder as the seed for reverse-mode autodiff, and perform feeds in the current parameter values plus that seed and runs the session.
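
If the grad_ys seed business is unfamiliar, here is a tiny self-contained illustration (my own example, assuming TensorFlow 1.x like the post; under TF 2 you'd go through tf.compat.v1):

import tensorflow as tf  # assumes TensorFlow 1.x, as in the post

theta = tf.placeholder(tf.float64, shape=(2,))
y = tf.stack([theta[0] * theta[1], theta[0] ** 2])  # toy F: R^2 -> R^2
dy = tf.placeholder(tf.float64, shape=(2,))         # gradient seed

# grad_ys seeds the reverse-mode sweep with dy, so this node computes
# the vector-Jacobian product dy^T (dF/dtheta)
grad = tf.gradients(y, theta, grad_ys=dy)

with tf.Session() as sess:
    print(sess.run(grad, feed_dict={theta: [1.0, 2.0], dy: [1.0, 0.0]}))
    # -> [array([2., 1.])], the first row of the Jacobian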

p.s. If you are able to share your code through something like a Google Colab notebook I’d be happy to help you get this working.
