Okay I think you were right and this is essentially the issue:
def print_value_theano(value, wrt):
tt.printing.Print('value')(
value
)
tt.printing.Print('grad')(
tt.jacobian(value.flatten(), wrt)
)
print('\n')
with pm.Model() as model_test:
beta = pm.Exponential(
'beta',
lam=0.5,
testval=0.5
)
value = tt.concatenate(([0.], [beta]))
print_value_theano(
value**0.9,
beta
)
value = np.array([0., 1])*beta
# The gradient of this is all NAN!!
print_value_theano(
value**0.9,
beta
)
arr = np.array([0., 1])
value = tt.switch(arr>0, arr*beta,0)
print_value_theano(
value**0.9,
beta
)
Which prints out
value __str__ = [0. 0.53588673]
grad __str__ = [0. 0.96459612]
value __str__ = [0. 0.53588673]
grad __str__ = [nan nan]
value __str__ = [0. 0.53588673]
grad __str__ = [0. 0.96459612]
The problem now is that even with the third version with switch (which only keeps the non-nan gradients) PyMC3 seems to refuse to sample, and I can’t use the first version because the zeros are actually in a big array that I can’t construct on the fly.