Hi everyone,
I am trying to use a custom theano.Op subclass in a model, and I am running into errors that I think come from applying a scalar Op to vector inputs. I have tried wrapping my Op in theano.tensor.Elemwise, but without success so far. Below are more details of the problem and a minimal working example of what I'm trying to do.
Any help, comments, or questions would be greatly appreciated. I will be happy to provide any clarification or additional information you might need to help.
In particular, how does pymc3 make functions defined in terms of built-in Theano Ops work with the vector inputs and outputs required to add observed variables to a likelihood? How does what I'm trying to do below differ from a function built from those Ops, and how can I correct the difference?
Thanks!
Jeff
I am trying to implement a deterministic function (a Fermi-Dirac integral) as part of my model. The function is defined through numerical integration, but its derivative is known, so I can implement the gradient and use NUTS, etc., in sampling. I have written a subclass of theano.Op that compiles and runs correctly outside of pymc3. Below is the code that defines my custom Op, followed by code using the Op in a pymc3 model.
# Import libraries required for minimal working example
import numpy as np
from sympy import mpmath
import theano
from theano import tensor as tt
import pymc3
# Custom Theano.Op subclass
class F(theano.Op):
    """
    Custom Theano Op using mpmath tanh-sinh quadrature integration.

    $ F(j, \eta) = \int_0^{\infty} \frac{x^j}{1 + e^{x - \eta}} \, dx $

    Parameters
    ----------
    j : scalar
        Index parameter. No gradient defined for this parameter.
    eta : scalar
        Continuous argument of function. Derivative is defined by
        $ \frac{d}{d\eta}F(j, \eta) = j F(j-1, \eta) $

    Returns
    -------
    f : scalar
        Output of integral.
    """
    # These parameters define a unique Op
    __props__ = ('maxdegree', 'upper_bound', 'cutoff')
    # Using itypes/otypes and letting theano deal with type checking
    itypes = [tt.dscalar, tt.dscalar]
    otypes = [tt.dscalar]

    def __init__(self, maxdegree=20, upper_bound=60., cutoff=15.):
        """
        Initialize the Op with specific values of integration control
        parameters.

        Parameters
        ----------
        maxdegree : int, optional
            Maximum degree of quadrature rule for mpmath.quad to try
            before quitting. Defaults to 20.
        upper_bound : float, optional
            Constant added to eta to define the upper bound of the integral.
            Defaults to 60.
        cutoff : float, optional
            Cutoff value to determine whether the integral should be
            split into two pieces or not. Defaults to 15.
        """
        super(F, self).__init__()
        self.maxdegree = maxdegree
        self.upper_bound = upper_bound
        self.cutoff = cutoff

    def perform(self, node, inputs, output_storage):
        """
        Numerically calculate the function.
        """
        # Convert inputs into types mpmath can read
        j, eta = map(float, inputs)

        # Define integrand
        def integrand(x):
            return mpmath.power(x, j)/(mpmath.exp(x - eta) + 1.)

        # Define bounds of integral, splitting at eta for large eta so
        # the quadrature resolves the step in the integrand
        if eta < self.cutoff:
            bounds = [0., eta + self.upper_bound]
        else:
            bounds = [0., eta, eta + self.upper_bound]

        # Perform integral
        integral = mpmath.quad(integrand, bounds, maxdegree=self.maxdegree)

        # Set output value
        output_storage[0][0] = np.array(float(integral))

    def grad(self, inputs, output_gradients):
        """
        Calculate the gradient with respect to eta; the gradient with
        respect to j is undefined.
        """
        j, eta = inputs
        g, = output_gradients
        return [
            theano.gradient.grad_undefined(self, 0, j),
            g*j*self(j - 1., eta),
        ]
# Minimal working example making use of custom Theano.Op
# 5-dimensional input, with 10 rows of data
input_data = np.random.rand(10, 5)
# 1 output value for each row of input data
output_data = np.random.rand(10)
input_data = theano.shared(input_data)
output_data = theano.shared(output_data)
example = pymc3.Model()
with example:
    # model parameter priors
    standard_dev = pymc3.Exponential('standard_dev', lam=1.)
    coefficient_vector = pymc3.Uniform('coefficient_vector',
                                       upper=400.,
                                       lower=-40.,
                                       shape=5,
                                       )

    # model
    input_vector = tt.dot(input_data, coefficient_vector)
    f = F()
    model = (2.*f(tt.constant(1., dtype=tt.dscalar), input_vector)
             /f(tt.constant(0., dtype=tt.dscalar), input_vector) - input_vector)

    # likelihood
    model_output = pymc3.Normal('model_output',
                                mu=model,
                                sd=standard_dev,
                                observed=output_data,
                                )
When I run this code, I get the following error from the pymc3 model:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-97255855fde9> in <module>()
25 f = F()
26 model = (2.*f(tt.constant(1., dtype=tt.dscalar), input_vector)
---> 27 /f(tt.constant(0., dtype=tt.dscalar), input_vector) - input_vector)
28
29 # likelihood
/home/Envs/pymc3/lib/python2.7/site-packages/theano/gof/op.pyc in __call__(self, *inputs, **kwargs)
613 """
614 return_list = kwargs.pop('return_list', False)
--> 615 node = self.make_node(*inputs, **kwargs)
616
617 if config.compute_test_value != 'off':
/home/Envs/pymc3/lib/python2.7/site-packages/theano/gof/op.pyc in make_node(self, *inputs)
964 raise TypeError(
965 "We expected inputs of types '%s' but got types '%s' " %
--> 966 (str(self.itypes), str([inp.type for inp in inputs])))
967 return theano.Apply(self, inputs, [o() for o in self.otypes])
968
TypeError: We expected inputs of types '[TensorType(float64, scalar), TensorType(float64, scalar)]'
but got types '[TensorType(float64, scalar), TensorType(float64, vector)]'
This error comes from passing a vector of input data into the Op, which expects scalar inputs. I am not sure how to properly apply my Op to each entry of the vector input_vector (the closest I have come is the theano.map sketch below).
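The workaround I have been considering is to build the loop explicitly with theano.map (a wrapper around theano.scan), so the scalar Op is applied once per entry. This is a sketch only, with illustrative names (j0, f_of_vector); I have not confirmed it inside the full model:

# Sketch of a possible workaround: apply the scalar Op to each entry
# of input_vector with an explicit symbolic loop instead of Elemwise
j0 = tt.constant(1., dtype='float64')
f = F()
f_of_vector, updates = theano.map(
    fn=lambda eta_i, j_val: f(j_val, eta_i),  # one scalar call per entry
    sequences=[input_vector],
    non_sequences=[j0],
)

Since scan knows how to differentiate through the loop, I would expect gradients (and therefore NUTS) to still work, but this seems clumsy for an elementwise function.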
If I replace
f = F()
with
f = tt.Elemwise(F())
in the above code block, wrapping my Op in Elemwise, I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-9-64089d35d7db> in <module>()
25 f = tt.Elemwise(F())
26 model = (2.*f(tt.constant(1., dtype=tt.dscalar), input_vector)
---> 27 /f(tt.constant(0., dtype=tt.dscalar), input_vector) - input_vector)
28
29 # likelihood
/home/Envs/pymc3/lib/python2.7/site-packages/theano/gof/op.pyc in __call__(self, *inputs, **kwargs)
613 """
614 return_list = kwargs.pop('return_list', False)
--> 615 node = self.make_node(*inputs, **kwargs)
616
617 if config.compute_test_value != 'off':
/home/Envs/pymc3/lib/python2.7/site-packages/theano/tensor/elemwise.pyc in make_node(self, *inputs)
576 inputs = list(map(as_tensor_variable, inputs))
577 out_dtypes, out_broadcastables, inputs = self.get_output_info(
--> 578 DimShuffle, *inputs)
579 outputs = [TensorType(dtype=dtype, broadcastable=broadcastable)()
580 for dtype, broadcastable in izip(out_dtypes,
/home/Envs/pymc3/lib/python2.7/site-packages/theano/tensor/elemwise.pyc in get_output_info(self, dim_shuffle, *inputs)
518 shadow = self.scalar_op.make_node(
519 *[get_scalar_type(dtype=i.type.dtype).make_variable()
--> 520 for i in inputs])
521
522 target_length = max([input.type.ndim for input in inputs])
/home/Envs/pymc3/lib/python2.7/site-packages/theano/gof/op.pyc in make_node(self, *inputs)
964 raise TypeError(
965 "We expected inputs of types '%s' but got types '%s' " %
--> 966 (str(self.itypes), str([inp.type for inp in inputs])))
967 return theano.Apply(self, inputs, [o() for o in self.otypes])
968
TypeError: We expected inputs of types '[TensorType(float64, scalar), TensorType(float64, scalar)]'
but got types '[Scalar(float64), Scalar(float64)]'
For comparison, here is my custom Op working correctly on scalar inputs outside of pymc3:
# Demonstration of Theano Op working outside of pymc3
j = tt.dscalar('j')
eta = tt.dscalar('eta')
f = F()(j, eta)
compiled_f = theano.function([j, eta], f)
print compiled_f(0., 4.)
# prints: 4.01814992792
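For what it's worth, the gradient also compiles on scalars. This is a quick check I have not included above; I expect compiled_grad(1., 4.) to reproduce compiled_f(0., 4.) via the identity d/d(eta) F(1, eta) = F(0, eta):

# Compile the gradient of F with respect to eta on scalar inputs
df_deta = tt.grad(f, eta)
compiled_grad = theano.function([j, eta], df_deta)
print compiled_grad(1., 4.)  # should match compiled_f(0., 4.) above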
I have also been unable to get my custom Op to work with Elemwise() outside of pymc3.
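From the second traceback, my guess is that tt.Elemwise expects an Op defined on theano.scalar types (hence the `Scalar(float64)` in the message) rather than on TensorType scalars. If that is right, a version of F subclassing theano.scalar.ScalarOp might be what Elemwise wants. Below is an untested sketch of that idea (FScalar is a hypothetical name, and the integration constants just mirror F's defaults); I have not worked out how grad should be written for this version, which is part of what I am asking:

from theano import scalar as ts

class FScalar(ts.ScalarOp):
    """
    Untested sketch: the same integral as F, but defined on
    theano.scalar types so that tt.Elemwise can broadcast it.
    """
    nin = 2   # j, eta
    nout = 1

    def __init__(self):
        # upgrade_to_float makes the output float64 for float64 inputs
        super(FScalar, self).__init__(
            output_types_preference=ts.upgrade_to_float)

    def impl(self, j, eta):
        # ScalarOp.perform calls impl(); reuse the mpmath quadrature,
        # with bounds and maxdegree hardcoded to F's defaults
        def integrand(x):
            return mpmath.power(x, j)/(mpmath.exp(x - eta) + 1.)
        bounds = [0., eta + 60.] if eta < 15. else [0., eta, eta + 60.]
        return float(mpmath.quad(integrand, bounds, maxdegree=20))

# The elementwise version would then be
f_elemwise = tt.Elemwise(FScalar())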