Adding Gaussian Processes in PyMC4 this summer!

I have previously contributed to PyMC4 and I am willing to implement, test, maintain a higher-level API for Gaussian Processes in PyMC4 using TensorFlow and TensorFlow Probability and also write tutorials and notebooks with expressive animations/figures explaining their usage under GSoC 2020. I also look forward to improve the upstream projects (TensorFlow and TensorFlow Probability) to better support the GP interface of PyMC4.

I have been referring to the following papers to get a deeper understanding of Gaussian Process (but found myself wondering how to implement):

Are there any other papers I need to refer from a point of view of implementing them in PyMC4? Do I need to go through the GP API in PyMC3 to implement a similar one in PyMC4 or some other design is desired? I have started working on a proposal but need some direction regarding topics to include and focus on (like project design/timeline). I also wanted to know if it is necessary to read this book cited by PyMC3. I will shortly start working on the project on my fork once I get some direction from the mentors/maintainers.

EDIT: I have created this GitHub Repository containing the projects and my draft project proposal.

(pinging @fonnesbeck @aloctavodia)

Thank you!


Rasmussen and Williams is definitely a good introduction, so I recommend it as a starting point.

It would be nice to have a GP submodule that works like the one in PyMC3. I’m not a big fan of how GP regression works on TFP; notably, it forces you to specify a prediction grid when you instantiate a GP regression class. We would prefer something that allows a fitted GP regression instance to predict over arbitrary grids.


Thanks for the reply @fonnesbeck. I saw the TFP implementation of GP and GPR and I agree that it must use fitted models. I will get back to you with some ideas by tomorrow once I go through the implementations of the GP submodule in PyMC3 and the papers.

Hi @fonnesbeck! I have tried to add a initial design for GPs in PyMC4 based on the one in PyMC3. I have implemented ExpQuad covarience, Zero mean and LatentGP (similar to Latent in PyMC3) on this fork of the project. I was facing a few problems that I wanted to share with you:

  1. The diag parameter in Covarience class has to be handled when multiple Covarience objects are added or multiplied. We can currently do this by implementing a interface (like Combination in PyMC3) to handle the diag parameter for each kernel in the combination seperately. Or we can use the interface implemented by TFP but we would lose the control over the diag parameter. What do you suggest here? Should we go for the TFP interface or make one like PyMC3? The code is present here
  2. prior method works correctly but we need to sample from the prior distribution in order to get the conditional when given parameter is None, which is a problem for PyMC4 as we can’t yield from in between the methods. I wonder how can this be done. I would love your ideas on that! (I didn’t completly understand this line of conditional method. ) The code is present here

We might as well use the TFP conventions for combining kernels, unless there is something special we need to do. All else being equal, I think we want to push things down to Tensorflow when we can. Can you think of any reason we would want to handle diag ourselves?

Regarding the conditional, I would just make all the parameters required, at least for now.

I think tfp doesn’t provide a diag parameter that only consider the diagonal entries of evaluated covariance matrix. So, we have two choices here:

  1. Add the diag support to tfp and use its interface to combine kernels.
  2. Create a interface similar to PyMC3 to combine kernels.

(Please correct me here if I am wrong here)

Regarding the conditional, I tried it with given parameters but the covariance matrix (calculated as \Sigma_{\star\star} - \Sigma_{\star}^{T}\Sigma^{-1}\Sigma_{\star}) has negative entries on the diagonal and so the cholesky decomposition fails while constructing a reparameterized MvNormal distribution. I am trying to figure it out and will return to you as soon as I get that done!

Thanks! Making all the parameters required did solve the problem for conditional. I have prepared a simple LatentGP model based on PyMC3’s design. Though I have not added any non-regression tests, I have tested the module using the script below and it seems to work fine for both multiple feature_ndims and batches of data. The module is present here at my fork. Before I proceed further, can you please review it at your own time? @aloctavodia It would be really nice to hear your thoughts too!

import numpy as np
import pymc4 as pm

X = np.random.randn(2, 10, 2, 2)
Xnew = np.random.randn(2, 5, 2, 2)

# Let's define out GP model and its parameters
mean_fn =
cov_fn =, 1., feature_ndims=2)
gp =, cov_fn)

def gpmodel():
    prior = yield gp.prior('prior', X)
    cond = yield gp.conditional('cond', Xnew, given={'X': X, 'f': prior})
    return cond

sample_prior = pm.sample_prior_predictive(gpmodel(), sample_shape=3)
assert sample_prior.prior_predictive['gpmodel/prior'].shape == (1, 3, 2, 10)
assert sample_prior.prior_predictive['gpmodel/cond'].shape == (1, 3, 2, 5)
samples = pm.sample(gpmodel(), num_samples=3, burn_in=1, num_chains=1)
assert samples.posterior['gpmodel/prior'].shape == (1, 3, 2, 10)
assert samples.posterior['gpmodel/cond'].shape == (1, 3, 2, 5)

Happy weekend!
Tirth Patel

Hey @aloctavodia and @fonnesbeck,

I have prepared a proposal for the project here. Can you please review it and provide suggestions?

Tirth Patel

1 Like