Experimenting a bit, I tried dispensing with the compiled functions and just returning the symbolic tensor objects (the symbolic parameter vector theta is now a member variable of the model):
def build_statespace_graph(self) -> None:
    # Substitute the symbolic parameter vector into the model matrices
    self.update(self.theta)

    # Build the Kalman filter graph; everything stays symbolic
    states, covariances, log_likelihood = self.kalman_filter.build_graph(
        self.data, *self.unpack_statespace()
    )

    self.f_loglike = log_likelihood
    self.f_loglike_grad = at.grad(log_likelihood, self.theta)
    self.f_y_hat = states
    self.f_cov_hat = covariances
The problem I run into here is how to hand these uncompiled functions to the PyMC model. Both of the examples I found that are close to what I'm trying to do (the custom black-box likelihood function and the custom distribution) use Python code wrapped in an Op, so that's how I ended up with a compiled Aesara function in an Op.
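For reference, the pattern from those examples wraps a compiled function inside an Op. A rough sketch of what I had (LogLikeOp is my own name for it, and I'm assuming the theta and log-likelihood tensors from above):

import aesara
import aesara.tensor as at
from aesara.graph.op import Op

# Compile the symbolic log-likelihood into an actual callable first
loglike_fn = aesara.function([model.theta], model.f_loglike)

class LogLikeOp(Op):
    itypes = [at.dvector]  # parameter vector theta
    otypes = [at.dscalar]  # scalar log-likelihood

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        outputs[0][0] = loglike_fn(theta)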
I can’t call model.f_loglike as if it were a function inside an Op like:
def perform(self, node, inputs, outputs):
    (theta,) = inputs
    outputs[0][0] = model.f_loglike(theta)
This obviously throws an error, because f_loglike is a tensor variable, not a function. On the other hand, if I try to hand the log-likelihood tensor to the potential directly, pm.Potential("likelihood", model.f_loglike), I get a MissingInputError:
MissingInputError: Input 0 (theta) of the graph (indices start from 0), used to compute Subtensor{int64}(theta, ScalarConstant{6}), was not provided and not given a value.
Just handing over the likelihood tensor doesn't cause the MCMC draws to be assigned to theta automatically (why would they be!).
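A minimal way to see the same failure, if I understand it correctly (a hypothetical stand-in for the Kalman filter graph):

import aesara
import aesara.tensor as at

# theta is a free input; nothing in the graph (or the PyMC model) supplies it
theta = at.vector("theta")
loglike = (theta ** 2).sum()  # stand-in for the Kalman filter log-likelihood

aesara.function([], loglike)  # raises MissingInputError for theta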
On the other hand, all of the tensors evaluate correctly when I test them by hand, e.g. model.f_y_hat.eval({model.theta: np.random.uniform(size=9)}), so it seems the graph is correctly defined right up until I need to link it to the sampler. I am clearly missing a piece of the puzzle with respect to how Aesara and PyMC interact: in which contexts I need to define a custom Op and in which I don't, and when I need to provide compiled code versus symbolic tensors. I'd be grateful for some guidance.
EDIT:
Ok, I think I figured it out. I just needed to use the parameter vector from the PyMC model as theta, rather than declaring a separate tensor variable and trying to fill it in later. So I assign the priors, flatten and concatenate them, then hand the result to the model.build_statespace_graph method, which calls model.update(theta) and finally the Kalman filter's build method. After that I can hand model.loglikelihood, model.states, and model.covariance to PyMC directly.
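In code, the working setup looks something like this (a sketch; the prior names and sizes are made up, only the flatten-concatenate-build pattern is the point):

import aesara.tensor as at
import pymc as pm

with pm.Model() as pm_model:
    # Hypothetical priors; in reality these are the structural parameters
    rho = pm.Normal("rho", 0.0, 1.0, size=2)
    sigma = pm.HalfNormal("sigma", 1.0, size=7)

    # The PyMC random variables themselves become theta
    model.theta = at.concatenate([rho.ravel(), sigma.ravel()])
    model.build_statespace_graph()  # calls update(theta), then the Kalman filter build

    # Symbolic tensors go straight into the model: no Op, no manual gradients
    pm.Potential("likelihood", model.f_loglike)
    pm.Deterministic("states", model.f_y_hat)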
This eliminates any need for Ops, as well as the need to ask for the gradients of the log-likelihood myself. Nice!