I am having trouble defining the gradient of a deterministic random variable.
The covariance matrix in my model depends on a parameter tau. I am given a precomputed covariance matrix which contains floats, +infs and -infs. Then I want to replace the +inf entries with 0, the -inf entries with tau and other entries x with tau - x.
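For reference, the intended mapping can be sketched in plain NumPy (using a small hypothetical matrix in place of the real precomputed one, which isn't shown here):

```python
import numpy as np

# Hypothetical precomputed matrix mixing finite and infinite entries.
covm_precomputed = np.array([[np.inf, -np.inf],
                             [0.5, 2.0]])

def transform(covm, tau):
    """+inf -> 0, -inf -> tau, finite x -> tau - x."""
    out = tau - covm               # handles the finite entries
    out[covm == np.inf] = 0.0      # +inf entries become 0
    out[covm == -np.inf] = tau     # -inf entries become tau
    return out

print(transform(covm_precomputed, 0.1))
```

This pure-NumPy version is only for checking the mapping; the question is how to express the same thing so that the gradient with respect to tau is available.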
To use NUTS, I tried to implement this as a Theano Op:
class CovMatrixOp(th.gof.Op):
    itypes = [tt.dscalar]
    otypes = [tt.dmatrix]

    def __init__(self, *args, **kwargs):
        super(CovMatrixOp, self).__init__(*args, **kwargs)

    def perform(self, node, inputs, outputs):
        tau = inputs[0]

        def helper(cell):
            if cell == np.Inf:
                return 0
            if cell == np.NINF:
                return tau
            return tau - cell

        outputs[0][0] = np.vectorize(helper)(covmPrecomputed)

    def grad(self, inputs, output_gradients):
        tau = inputs[0]  # not even necessary

        def helper(cell):
            if cell == np.Inf:
                return 0
            if cell == np.NINF:
                return 1
            return 1

        return [np.vectorize(helper)(covmPrecomputed)]
In the model context, I then call:
covm = CovMatrixOp()(0.1+tau_)
and use this as the covariance matrix of a MvNormal.
However, initializing NUTS fails, and when I use MAP it says that the gradient is not available.
I'd be grateful for any suggestions!
Since the covariance matrix is precomputed, instead of a Theano Op it may be easier to build index masks outside of the PyMC3 model, and assign 0, tau, and tau - x to the corresponding entries inside the model using tt.set_subtensor.
Thank you for your suggestion!
Could you give me a hint as to how I could create such an index matrix?
Since Metropolis converges slowly on my dataset, I’d like to use NUTS, but I find it difficult to get started with Theano.
I would try something like:
import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt
cov = np.vstack([np.asarray([-np.inf, -np.inf, np.inf]),
                 np.random.randn(2, 3)])
print(cov)
inf_mask = cov == np.inf
neg_inf_mask = cov == -np.inf
finite_mask = np.isfinite(cov)
cov[inf_mask] = 0.
print(cov)
cov = theano.shared(cov)
with pm.Model() as m:
    tau = pm.Normal('tau', testval=1.)  # <== set a test value here to check result
    cov = tt.set_subtensor(cov[neg_inf_mask], tau)
    cov = tt.set_subtensor(cov[finite_mask], tau - cov[finite_mask])
    print(cov.tag.test_value)
Thank you, that was very helpful! With slight modifications I got it to work (the original version still gave me some errors): I found that Theano requires .nonzero() for Boolean indexing to work.
import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt
cov = np.vstack([np.asarray([-np.inf, -np.inf, np.inf]),
                 np.random.randn(2, 3)])
print(cov)
inf_mask = cov == np.inf
neg_inf_mask = cov == -np.inf
finite_mask = np.isfinite(cov)
print(cov)
cov = theano.shared(cov)
with pm.Model() as m:
    tau = pm.Normal('tau', testval=1.)  # <== set a test value here to check result
    cov = tt.set_subtensor(cov[neg_inf_mask.nonzero()], tau)
    cov = tt.set_subtensor(cov[finite_mask.nonzero()], tau - cov[finite_mask.nonzero()])
    cov = tt.set_subtensor(cov[inf_mask.nonzero()], 0)
    print(cov.tag.test_value)
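The role of .nonzero() can be illustrated in plain NumPy: a Boolean mask and the index arrays returned by its nonzero() select the same cells, and Theano's set_subtensor wants the index-array form (a sketch with a hypothetical matrix, not run against Theano itself):

```python
import numpy as np

# Hypothetical small matrix mixing finite and infinite entries.
a = np.array([[1., -np.inf],
              [np.inf, 2.]])
mask = np.isfinite(a)

# Both forms pick out the same cells, in the same (row-major) order.
print(a[mask])            # finite entries: [1. 2.]
print(a[mask.nonzero()])  # same entries:   [1. 2.]
```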