Computing the gradient of the likelihood function wrt hyperpriors in a hierarchical model

Hi everyone, this is my first post so please tell me if anything is wrong with the format or if I need to provide some code, but my question is conceptual for now. I want to implement the stochastic Hamiltonian Monte Carlo (based on the provided sgmcmc), it almost works for a simple model, but my case of interest is a hierarchical model. I want to compute the gradient of the likelihood function with respect to all model parameters, this works for the priors as they are direct input to the likelihood function, but I can’t seem to get the gradient wrt the hyperpriors, as I get a MissingInputError error, for now I use this to compute the gradient of the sum of the likelihood of each data point, as you can see I have suppressed the disconnected_inputs, thus the gradient for the hyperparameters is always zero. I’m not sure how to proceed.

sum_ = tt.sum(model.observed_RVs[0].logp_elemwiset, axis =0)
gradients = []
for var in model.vars:
    gradients.append(theano.grad(sum_, var,  disconnected_inputs='ignore'))
grad_final = [gradients[i].flatten() for i in range(len(gradients)) ]
dlogLsum = theano.clone(tt.concatenate(grad_final, axis=0), flat_view.replacements, strict=True)