Differentiating a GP function


I was wondering if there is an easy way to differentiate a GP function with respect to x*. I can easily do it numerically, but I want to get a posterior interval for it as well.

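import pymc3 as pm
import theano.tensor as tt

# Xs, Xu_init, Ys1, Ys2, Ys1err, Ys2err and input_dim are defined elsewhere (not shown).
with pm.Model() as model:
    x_scale = 2.0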
    m_scale = 8.0
    eta1 = pm.HalfNormal("eta1", sigma=2)

    cov_func1 = eta1 ** 2 * pm.gp.cov.Matern52(input_dim=input_dim, ls=x_scale, active_dims=[0]) \
                * pm.gp.cov.Matern52(input_dim=input_dim, ls=m_scale, active_dims=[1])

    eta2 = pm.HalfNormal("eta2", sigma=2)
    cov_func2 = eta2 ** 2 * pm.gp.cov.Matern52(input_dim=input_dim, ls=x_scale, active_dims=[0]) \
                * pm.gp.cov.Matern52(input_dim=input_dim, ls=m_scale, active_dims=[1])

    gp1 = pm.gp.MarginalSparse(cov_func=cov_func1, approx="VFE")
    gp2 = pm.gp.MarginalSparse(cov_func=cov_func2, approx="VFE")

    # set flat prior for Xu
    # Xu = pm.Flat("Xu1", shape=Xu_init.shape, testval=Xu_init)

    sigma_1 = pm.HalfCauchy("sigma_1", beta=5)
    sigma_2 = pm.HalfCauchy("sigma_2", beta=5)

    y1_err_ = tt.sqrt(Ys1err * Ys1err + sigma_1 * sigma_1)
    y2_err_ = tt.sqrt(Ys2err * Ys2err + sigma_2 * sigma_2)

    y1_ = gp1.marginal_likelihood("y1", X=Xs, Xu=Xu_init, y=Ys1, noise=y1_err_)
    y2_ = gp2.marginal_likelihood("y2", X=Xs, Xu=Xu_init, y=Ys2, noise=y2_err_)

with model:
    mp = pm.find_MAP(method="BFGS")


To clarify, are you interested in automatically calculating df/dx where f \sim GP(\mu(x), K(x,x')) is a random function sampled from a GP?

Thank you for the clarification. Yes, that is correct.
I want to compute \frac{\partial f}{\partial x_i} where f \sim \mathcal{GP}(0, K({\bf x}^{\prime}, {\bf x})).

I don’t have deep familiarity with the internals of the GP module and so perhaps there is a shortcut I’m unaware of. However, section 9.4 of this book suggests that \frac{df}{dx} \sim GP( \frac{\partial \mu}{\partial x}, \frac{\partial^2K}{\partial x \partial x'}). This means that you just need to figure out what the new mean function and kernel K' would be and then create a new GP with them in your model. I think this is a fairly straightforward solution and can try to show you if you are interested. Unfortunately, it will not be a generic solution applicable to each kernel as the derivative kernel will be different for every original kernel function.
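Here is a minimal sketch of the idea in plain NumPy rather than through the PyMC GP module. It assumes a single input dimension, a zero mean function, an exact (not sparse/VFE) GP, and hyperparameters held fixed (for example at their MAP values). The helper names `matern52`, `d_matern52`, `dd_matern52` and `derivative_posterior` are made up for illustration and are not part of PyMC, and the data is synthetic. The derivative kernels were worked out by hand for the Matern 5/2 kernel, so please check them before relying on them.

```python
import numpy as np

def _diff_terms(xa, xb, ls):
    # Pairwise signed differences d, their absolute value r, and a = sqrt(5) / ls.
    d = xa[:, None] - xb[None, :]
    return d, np.abs(d), np.sqrt(5.0) / ls

def matern52(xa, xb, ls, eta=1.0):
    """Matern 5/2 kernel: Cov(f(xa), f(xb)) for 1-D inputs."""
    d, r, a = _diff_terms(xa, xb, ls)
    return eta**2 * (1.0 + a * r + (a * r) ** 2 / 3.0) * np.exp(-a * r)

def d_matern52(xa, xb, ls, eta=1.0):
    """dK/dxa: Cov(f'(xa), f(xb))."""
    d, r, a = _diff_terms(xa, xb, ls)
    return -eta**2 * (a**2 / 3.0) * d * (1.0 + a * r) * np.exp(-a * r)

def dd_matern52(xa, xb, ls, eta=1.0):
    """d^2 K / (dxa dxb): Cov(f'(xa), f'(xb))."""
    d, r, a = _diff_terms(xa, xb, ls)
    return eta**2 * (a**2 / 3.0) * (1.0 + a * r - (a * r) ** 2) * np.exp(-a * r)

def derivative_posterior(x_new, x, y, ls, eta, noise):
    """Posterior mean and covariance of f'(x_new) given noisy observations y of f(x).

    `noise` is the observation standard deviation (scalar or one value per point).
    """
    K = matern52(x, x, ls, eta) + np.diag(np.broadcast_to(noise, x.shape) ** 2)
    Kd = d_matern52(x_new, x, ls, eta)           # cross-covariance f'(x_new) vs f(x)
    Kdd = dd_matern52(x_new, x_new, ls, eta)     # prior covariance of f'(x_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    v = np.linalg.solve(L, Kd.T)
    mean = Kd @ alpha
    cov = Kdd - v.T @ v
    return mean, cov

# Synthetic example: noisy samples of sin(x), whose true derivative is cos(x).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
x_new = np.linspace(1.0, 9.0, 50)

mean, cov = derivative_posterior(x_new, x, y, ls=2.0, eta=1.0, noise=0.1)
sd = np.sqrt(np.clip(np.diag(cov), 0.0, None))
lower, upper = mean - 1.96 * sd, mean + 1.96 * sd  # pointwise 95% interval for df/dx
```

For a product kernel over separate input dimensions like the one in the question, the derivative with respect to one input only differentiates that dimension's factor, so the other Matern 5/2 factor would multiply each of the three kernels above unchanged. The interval here is conditional on the fixed hyperparameters; propagating uncertainty in eta and sigma would require repeating the calculation over posterior samples.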


Thank you for pointing out this book. I was wondering whether PyMC has an implementation of this, at least for differentiable kernels such as the Matern 3/2 and 5/2 kernels. I will try to figure out how to implement it, but if you have an example, I am sure it would be super helpful.

Thank you,

That could indeed be an interesting addition to the GP module, hmmm…
