Deterministic Theano Op - Gradient

I am having trouble defining the gradient of a deterministic random variable.
The covariance matrix in my model depends on a parameter tau. I am given a precomputed covariance matrix containing finite floats, +inf and -inf entries. I want to replace the +inf entries with 0, the -inf entries with tau, and every other entry x with tau - x.
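
In plain NumPy terms, on a small made-up example matrix, the transformation I want is:

import numpy as np

# Made-up stand-in; the real covmPrecomputed is given to me.
covmPrecomputed = np.array([[np.inf, -np.inf],
                            [0.5, -0.2]])
tau = 1.0

out = np.where(np.isposinf(covmPrecomputed), 0.,
      np.where(np.isneginf(covmPrecomputed), tau,
               tau - covmPrecomputed))
# out == [[0. , 1. ], [0.5, 1.2]]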
To use NUTS, I tried to implement this as a Theano Op:

import numpy as np
import theano as th
import theano.tensor as tt

class CovMatrixOp(th.gof.Op):
    itypes = [tt.dscalar]
    otypes = [tt.dmatrix]

    def __init__(self, *args, **kwargs):
        super(CovMatrixOp, self).__init__(*args, **kwargs)

    def perform(self, node, inputs, outputs):
        # covmPrecomputed is the precomputed matrix described above
        tau = inputs[0]
        def helper(cell):
            if cell == np.inf:
                return 0
            if cell == -np.inf:
                return tau
            return tau - cell
        outputs[0][0] = np.vectorize(helper)(covmPrecomputed)

    def grad(self, inputs, output_gradients):
        tau = inputs[0]  # not even necessary
        def helper(cell):
            if cell == np.inf:
                return 0
            if cell == -np.inf:
                return 1
            return 1
        return [np.vectorize(helper)(covmPrecomputed)]

In the model context, I then call:
covm = CovMatrixOp()(0.1+tau_)
and use this as the covariance matrix of an MvNormal.

However, initializing NUTS fails, and when I use MAP it says that the gradient is not available.
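
My guess is that grad has to return a symbolic Theano expression rather than a NumPy array. An untested sketch of what I imagine it should look like, summing the incoming output gradient over the entries that actually depend on tau:

    def grad(self, inputs, output_gradients):
        # Untested: the output's derivative w.r.t. tau is 1 everywhere
        # except at the +inf cells, so the gradient for the scalar tau
        # would be the sum of the output gradient over those positions.
        mask = tt.constant((~np.isposinf(covmPrecomputed)).astype('float64'))
        return [tt.sum(output_gradients[0] * mask)]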
I’d be very grateful for any suggestions! 🙂

Since the covariance matrix is precomputed, instead of a Theano Op it might be easier to generate an index matrix outside of the pymc3 model, and assign 0, tau, and tau - x to the different indices inside the model using tt.set_subtensor.

Thank you for your suggestion!
Could you give me a hint as to how I could create such an index matrix?
Since Metropolis converges slowly on my dataset, I’d like to use NUTS, but I find it difficult to get started with Theano.

I would try something like:

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt

cov = np.vstack([np.asarray([-np.inf, -np.inf, np.inf]), 
                 np.random.randn(2, 3)])
print(cov)
inf_mask = cov == np.inf
neg_inf_mask = cov == -np.inf
finite_mask = np.isfinite(cov)

cov[inf_mask] = 0.
print(cov)

cov = theano.shared(cov)
with pm.Model() as m:
    tau = pm.Normal('tau', testval=1.) # <== set a test value here to check result
    cov = tt.set_subtensor(cov[neg_inf_mask], tau)
    cov = tt.set_subtensor(cov[finite_mask], tau - cov[finite_mask])
    
print(cov.tag.test_value)

Thank you, that was very helpful! I managed to get it to work with slight modifications (the code as posted still raised some errors). I found that Theano requires .nonzero() for Boolean indexing to work.
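
(.nonzero() converts a boolean mask into a tuple of integer index arrays, which Theano's advanced indexing accepts:)

import numpy as np

mask = np.array([[True, False],
                 [False, True]])
print(mask.nonzero())  # (array([0, 1]), array([0, 1]))

Here is my modified version: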

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt

cov = np.vstack([np.asarray([-np.inf, -np.inf, np.inf]), 
                 np.random.randn(2, 3)])
print(cov)
inf_mask = cov == np.inf
neg_inf_mask = cov == -np.inf
finite_mask = np.isfinite(cov)

cov = theano.shared(cov)
with pm.Model() as m:
    tau = pm.Normal('tau', testval=1.) # <== set a test value here to check result
    cov = tt.set_subtensor(cov[neg_inf_mask.nonzero()], tau)
    cov = tt.set_subtensor(cov[finite_mask.nonzero()], tau - cov[finite_mask.nonzero()])
    cov = tt.set_subtensor(cov[inf_mask.nonzero()], 0.)

print(cov.tag.test_value)
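
From there, a minimal sketch of how this could plug into the rest of my model. The observed data y_obs below is a hypothetical stand-in, and this assumes the transformed matrix is actually a valid (positive definite) covariance:

with m:
    y_obs = np.random.randn(10, 3)  # hypothetical stand-in for my dataset
    obs = pm.MvNormal('obs', mu=np.zeros(3), cov=cov, observed=y_obs)
    trace = pm.sample(1000, tune=1000)  # NUTS is the default sampler here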