Custom operation for the likelihood and caching the likelihood value

I think I can do the following to avoid evaluating the same intermediate computation Y twice.

  1. Wrap the intermediate calculation (whose result is X) into an operation Y without a gradient.
  2. Create an operation A without a gradient, which calculates the log-likelihood log(X) using the operation Y.
  3. Create an operation B without a gradient, which calculates the gradient of the log-likelihood, (1 / X) * ∂X/∂θ, using the operation Y again.
  4. Create an operation C with a gradient, which combines the log-likelihood and its gradient; operation C depends on operations A and B (see the sketch after this list).

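Here is a minimal sketch of what I have in mind, assuming the `itypes`/`otypes` shortcut of `tt.Op`; `expensive_intermediate` and the byte-keyed cache are just placeholders I made up for illustration, not an existing API:

```python
import numpy as np
import theano.tensor as tt

# Placeholder for the expensive intermediate computation Y: it returns
# the likelihood X and its derivative dX/dtheta in one pass.
def expensive_intermediate(theta):
    X = np.exp(-0.5 * np.dot(theta, theta))   # stand-in likelihood
    dX_dtheta = -theta * X                    # stand-in derivative
    return X, dX_dtheta

# Memoize on the parameter values so that the logp Op and the grad Op,
# when evaluated at the same theta, trigger only one evaluation of Y.
_cache = {}

def cached_intermediate(theta):
    key = theta.tobytes()
    if key not in _cache:
        _cache.clear()                        # keep only the latest point
        _cache[key] = expensive_intermediate(theta)
    return _cache[key]

class LogLikeGrad(tt.Op):
    """Operation B: gradient of the log-likelihood, itself without a gradient."""
    itypes = [tt.dvector]
    otypes = [tt.dvector]

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        X, dX = cached_intermediate(theta)
        outputs[0][0] = (1.0 / X) * dX        # (1 / X) * dX/dtheta

class LogLike(tt.Op):
    """Operations A and C rolled into one: log(X), with grad() delegating to B."""
    itypes = [tt.dvector]
    otypes = [tt.dscalar]

    def __init__(self):
        self.loglike_grad = LogLikeGrad()

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        X, _ = cached_intermediate(theta)
        outputs[0][0] = np.array(np.log(X))

    def grad(self, inputs, output_gradients):
        (theta,) = inputs
        return [output_gradients[0] * self.loglike_grad(theta)]
```

In a model I would then use this through something like `pm.Potential("loglike", LogLike()(theta))`, but I am not sure the cache is the right way to share Y between A and B.
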
But when I define a custom Theano operation, my impression is that its methods (in particular `perform()`) receive numpy arrays instead of Theano tensors as arguments. While Theano tensors are, as I understand it, associated with a hash in the computation graph, the numpy arrays would not carry such a hash.
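For example, as far as I understand the two-method form of an Op, only `make_node` sees symbolic variables; `perform` already receives concrete numpy values (this toy `Doubler` Op is just for illustration):

```python
import numpy as np
import theano
import theano.tensor as tt

class Doubler(theano.Op):
    def make_node(self, x):
        # x here is a symbolic Theano variable, i.e. a node in the graph
        x = tt.as_tensor_variable(x)
        return theano.Apply(self, [x], [x.type()])

    def perform(self, node, inputs, outputs):
        # inputs[0] here is a plain numpy array holding concrete values
        outputs[0][0] = np.asarray(inputs[0]) * 2.0
```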

Can my strategy really avoid evaluating the computation in operation Y twice?

How often does NUTS ask for the log-likelihood and for the gradient?