[PyMCon Web Series 03a] Introduction to Hilbert Space GPs (HSGPs) in PyMC (Mar 15, 2023) (Bill Engels)

PyMC Web Series: Introduction to Hilbert Space GPs in PyMC: A fast Gaussian process approximation that you can actually use

Speaker: Bill Engels (@bwengals)

Event type: Live webinar
Date: March 15 2023 (subscribe here for email updates)
Time: 22:00 UTC
Register for the live webinar on Meetup to get the Zoom link


We thank our sponsors for supporting PyMC and the PyMCon Web Series. If you would like to sponsor us, contact us for more information.

Mistplay is the #1 Loyalty Program for mobile gamers - with over 20 million users worldwide. Millions of gamers use our platform to discover games, connect with friends, and earn awesome rewards. We are a fast-growing, profitable company, recently ranked as the 3rd fastest-growing technology company in Canada. Our passion for innovation drives our growth across the industry with the development of new apps, powerful ad tech tools, and the recent launch of a publishing division for mobile games.

Mistplay is hiring for a Senior Data Scientist (Remote or Montreal, QC).


Welcome to another event in the PyMCon Web Series! As part of this series, most events will have one or more asynchronous components and at least one live, synchronous component.

Recording of Bill’s synchronous talk

Bill’s Interview

In this case, Bill has provided:

  • A set of resources for those looking for a background on Gaussian processes: see here
  • A notebook including some pre-made approaches to data analysis for you to try out on your own data. These include Gaussian processes as well as related approaches (regression, B-splines, etc.): see here

In addition, approximately a week after Bill’s synchronous talk, we will host a special edition of PyMC Office Hours specifically focused on Gaussian processes of all flavors: see here

Abstract of the talk

Gaussian processes (GPs) are a versatile tool in the Bayesian modeler's toolbox – in theory. In practice, for all but the smallest data sets, one needs to resort to approximations to actually fit GPs in any reasonable amount of time. There are currently a few GP approximations implemented in PyMC based on inducing points that are fast, but only apply when the likelihood is Gaussian. The Hilbert Space Gaussian Process (HSGP) approximation works well with any likelihood and scales as O(nm + m). In this talk I’ll introduce a PyMC HSGP implementation and show via case studies how it fills a few key gaps in the PyMC GP library: fast GPs as model subcomponents, and fast GPs with non-Gaussian likelihoods. I’ll also cover tips and tricks for applying HSGPs effectively in practice.


In preparation for @bwengals’s live talk (and following up on Danh Phan’s recent PyMCon Web Series event), we would like to present some background material on Gaussian processes (GPs). If you are interested in GPs, but perhaps don’t feel quite ready for the more advanced topics @bwengals will cover, we hope these materials may provide a bit of preparation.

If you have further questions, you can ask in this thread. However, we will also be holding a special edition of PyMC Office Hours specifically focused on Gaussian processes of all flavors (simple and advanced). So it may be easier to ask one of our GP experts your question at that time. Details of those office hours will be posted in this thread once they are finalized.


The final component of this PyMC Web Series event will be a special edition of PyMC office hours with a specific focus on Gaussian processes. @bwengals will be there to answer questions. We will also be joined by @DanhPhan, who recently presented about Multi-Output Gaussian Processes in this series. As a special guest, we will also have PyMC BDFL @fonnesbeck. All three are well-versed in both GPs and PyMC. If you have questions about Gaussian processes, do not miss this!

Registration is available on Meetup.

This special GP-focused edition of office hours will be held on March 22nd (or 23rd depending on your time zone) at the time listed below. Office hours will last about an hour, so don’t worry if you can’t make it at exactly this time!

UTC - 20:00 (March 22nd)
New York - 6pm (March 22nd)
Seattle - 3pm (March 22nd)
Sydney - 9am (March 23rd)
Tokyo - 7am (March 23rd)

Office hours will be held on Zoom.
Meeting link: Launch Meeting - Zoom
Meeting ID: 872 9690 4620
Passcode: 957697

Please note that participants are expected to abide by PyMC’s Code of Conduct.


Is there a difference between pm.set_data and the conditional method for this GP? Does this behave differently from the other existing GPs?


Summarizing Bill’s answer, the pm.set_data() approach is easy/straightforward, but only available for HSGPs. For other GPs you need to use the conditional approach. Thanks for the question @wdeanHPA !


Can I sample >one function< from the posterior and then deterministically evaluate that function on new X ?

…to run an optimizer on that function

…it’s not possible with GPs, but my prior is that this approximation allows it (like the RFF approximation which is very similar)


Stan user here. Do I understand that constrained prior finds the parametric prior with 95% highest posterior density between lower and upper?


I believe that is correct. The documentation for that function is here: pymc.find_constrained_prior — PyMC dev documentation Let me know if that doesn’t answer your question.



That’s correct. You can find more info here: How to find priors intuitively - YouTube and here: pymc.find_constrained_prior — PyMC 5.1.2 documentation


Hey, two questions:

  1. Do you have some benchmarks comparing GPs to HSGPs?
  2. Do HSGPs only work for Matern covariance functions? And since I don’t use GPs a lot, I have to ask whether this would be a limitation or not.

(I was under the impression that the quadExp covariance is a special case of Matern)


Question for Bill:

  1. How fast is it to generate the basis set given X and the kernel hyperparameters?
  2. Can this conversion be released as a standalone function outside of PyMC?
  3. If you want to optimize the hyperparameters, can you just optimize the marginal likelihood of the corresponding induced linear model, ignoring the GP?


Fantastic overview of HSGPs. You hit on all the pain points of using GPs for me. Can’t wait to get the slides and notebook. Thank you!


Thank you for the great talk! I was wondering: is it possible to tack an arbitrary mean function onto an HSGP in the same way that you would for a vanilla GP? Thinking of your example here, where you add a piecewise linear mean onto the existing GP.


Hello! May I ask when HSGP will be officially released? Are there any learning examples available now?

Thanks Christian, that’s right. I want to add that you can use .conditional with HSGPs too, so both work.

  1. Nope I don’t, but you’ll find some here: [2004.11408] Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming
  2. Yep, ExpQuad is part of the Matern family. These are just what’s implemented for PyMC at this point. The requirement is that the kernel is stationary. More info in Sec 4.2.1 here: https://gaussianprocess.org/gpml/chapters/RW.pdf

Not sure, it’ll be part of the next release. As of now, there’s no other PyMC HSGP material out there – working on some for the next event and for the docs. If you want to give it a try sooner, see the examples in the docstrings of HSGP.prior, HSGP.conditional and HSGP.prior_linearized.

@ScroobiusTrip I wish I’d mentioned this in the talk. You don’t really need to worry about whether something is specifically a mean function if you’re using HSGPs. Just add the (mean zero) HSGP to the piecewise linear part of your model – this is equivalent to using the piecewise linear model as the mean function within a GP. For the rest of the GP implementations in PyMC, you need to at least wrap things into a Mean function object so that prediction via gp.conditional works correctly.

For your case though, just add them! If you use HSGP.prior_linearized you can use pm.set_data and it’ll work for both parts of the model.


  • It’s \mathcal{O}(m^2 n) to generate the basis (m basis functions, n data points). But, you don’t need to know anything about the kernel hyperparams or even the kernel to do it.
  • No plans now, but the implementation of the tricky parts is here.
  • I think so! f = phi @ (beta * sqrt_psd), with each \beta_i \sim N(0, 1). So you should be able to calculate the marginal likelihood by integrating out beta, just like one could with a vanilla Bayesian linear regression model. Sorry, I didn’t quite get this live.
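To spell out that induced linear model, here is a plain-NumPy sketch for a 1-D ExpQuad kernel, using the Laplacian eigenfunctions on [-L, L] and the ExpQuad spectral density from the HSGP paper. The helper names (hsgp_basis, expquad_psd) are mine:

```python
import numpy as np

def hsgp_basis(x, m, L):
    """Laplacian eigenfunctions on [-L, L]:
    phi_j(x) = sqrt(1/L) * sin(sqrt(lam_j) * (x + L)), sqrt(lam_j) = j*pi/(2L)."""
    j = np.arange(1, m + 1)
    sqrt_lam = j * np.pi / (2 * L)
    phi = np.sqrt(1 / L) * np.sin(sqrt_lam * (x[:, None] + L))
    return phi, sqrt_lam

def expquad_psd(omega, eta, ell):
    """1-D power spectral density of the ExpQuad kernel."""
    return eta**2 * np.sqrt(2 * np.pi) * ell * np.exp(-0.5 * (ell * omega) ** 2)

rng = np.random.default_rng(2)
x = np.linspace(-1, 1, 100)
m, L = 50, 2.0

phi, sqrt_lam = hsgp_basis(x, m, L)                  # basis, shape (n, m)
sqrt_psd = np.sqrt(expquad_psd(sqrt_lam, eta=1.0, ell=0.3))

# The induced linear model: a GP draw is just a weighted sum of fixed basis
# functions, with iid standard normal coefficients beta
beta = rng.standard_normal(m)
f = phi @ (beta * sqrt_psd)                          # one draw, shape (n,)
```

Because f is linear in beta with a Gaussian prior, integrating beta out analytically works exactly as in Bayesian linear regression; as a sanity check, phi @ diag(psd) @ phi.T should closely match the exact ExpQuad kernel matrix.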

Does a function like this do the trick?

import pymc as pm
import pytensor.tensor as pt

def one_gp_function(X_new, beta_posterior_sample, cov_func, L, m, X_mean):
    # Center the new inputs the same way the training inputs were centered
    Xs = X_new - X_mean
    L = pt.as_tensor_variable(L)
    # Rebuild the HSGP basis at the new input locations
    eigvals = pm.gp.hsgp_approx.calc_eigenvalues(L, m, tl=pt)
    phi = pm.gp.hsgp_approx.calc_eigenvectors(Xs, L, eigvals, m, tl=pt)

    # Evaluate the kernel's power spectral density at sqrt(eigenvalues)
    omega = pt.sqrt(eigvals)
    psd = cov_func.power_spectral_density(omega).eval()
    # One posterior draw of beta gives one deterministic function of X_new
    f = phi @ (beta_posterior_sample * pt.sqrt(psd))
    return f

I’m sure this could be cleaner, but I think one “function” from the GP posterior will correspond to one draw from the posterior of beta. Then you can regenerate the phi basis given arbitrary X and that’s your function.


Thank you for your response! I have installed the development version of PyMC from the pymc-devs GitHub repository to try HSGP. It’s really outstanding work! I have two questions: (1) Is using HSGP.prior_linearized faster than the regular HSGP.prior when using multiple HSGPs? I prefer the latter for its simplicity and ease of use. (2) When using HSGP, is there a speed advantage to using pymc.sampling.jax.sample_numpyro_nuts?

For (1), it should be if your GPs are the same size, so you can stack the coefficients and do one matrix-matrix multiplication instead of several matrix-vector multiplications. For (2), it usually is for me.
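The stacking trick in plain NumPy, just to show the shape bookkeeping (the basis here is filled with placeholder random values, not a real HSGP basis):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, k = 200, 30, 4                       # n points, m basis functions, k GPs

phi = rng.standard_normal((n, m))          # shared basis across the k GPs
betas = [rng.standard_normal(m) for _ in range(k)]

# Several matrix-vector products, one per GP...
fs_loop = np.stack([phi @ b for b in betas], axis=1)

# ...collapse into a single matrix-matrix product when the GPs share a basis
B = np.stack(betas, axis=1)                # coefficients stacked, shape (m, k)
fs_stacked = phi @ B                       # all k GPs at once, shape (n, k)

assert np.allclose(fs_loop, fs_stacked)
```

This is exactly why prior_linearized helps with multiple same-sized GPs: you control the coefficient layout, so you can batch the products yourself.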