Gaussian Processes: Sampling kernel hyperparameters?

jbuddy_13 · January 6, 2021, 2:56pm

From the official documentation on PyMC3’s Gaussian Process implementation(s), and more specifically, the Marginal class:

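import numpy as np
import pymc3 as pm

# A one dimensional column vector of inputs.
X = np.linspace(0, 1, 10)[:, None]
# y (not defined in this excerpt) is assumed to be the matching vector of 10 observations.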

with pm.Model() as model:
    # Specify the covariance function.
    cov_func = pm.gp.cov.ExpQuad(1, ls=0.1)

    # Specify the GP.  The default mean function is `Zero`.
    gp = pm.gp.Marginal(cov_func=cov_func)

    # Prior on the observation noise; the latent function f is integrated out
    # in the marginal likelihood.
    sigma = pm.HalfCauchy("sigma", beta=3)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=sigma)

...

# After fitting or sampling, specify the distribution
# at new points with .conditional
Xnew = np.linspace(-1, 2, 50)[:, None]

with model:
    fcond = gp.conditional("fcond", Xnew=Xnew)

I see that a prior is placed over sigma, the error term, and that’s it. The kernel basically decides how squiggly/smooth your function should be, yet priors have not been placed on the kernel terms sigma_f and ls. So now I’m wondering, is explicitly modeling the kernel hyperparameters an atypical design choice? (Perhaps this was not included for the sake of tutorial brevity.)

Out of curiosity, how were 1 and 0.1 selected as Kernel hyperparameters? I’m uncertain on the ranges that they should take on. (Naively, I might assume both sigma_f and ls should fall within range [0,1] in the RBF/ExpQuad kernel.)

I ask these questions as it’s my understanding that both the Prior and Posterior distributions in a Gaussian Process are distributions over functions. And so it’s my (naive) belief that pm.gp.cov.ExpQuad(1, ls=0.1) is just one such function, not a distribution of functions.

Could anyone clarify?

jbuddy_13 · January 15, 2021, 12:18am

And… crickets!

Martin_Ingram · January 15, 2021, 1:07am

Hi there,

I wasn’t involved in designing the example, but I have worked with GPs quite a lot, so I may have some comments for you regarding your questions:

1. Fixing the parameters of the covariance function is unusual, in my opinion. I expect that this was just a choice made to keep the documentation succinct, as you suggest. I can’t think of a compelling reason for the two values chosen apart from them being convenient example values. Unless you have a very good reason for fixing them, you would probably want to do inference over them too. There seem to be longer tutorials that cover this and that you might like, such as the "Gaussian Process Regression" page of the PyMC3 3.1rc3 documentation.
2. Regarding the question of whether fixing the hyperparameters means that the GP is no longer a distribution over functions, I believe the answer is no: it is still a distribution over functions. It is just a distribution that places more prior mass on functions whose lengthscale and variance are similar to the ones specified. You can see that there is still a lot of variation in the functions that are consistent with this prior if you sample from it:

import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt

X = np.linspace(0, 1, 100)[:, None]

with pm.Model() as model:
    # Specify the covariance function.
    cov_func = pm.gp.cov.ExpQuad(1, ls=0.1)

    # Specify the GP.  The default mean function is `Zero`.
    gp = pm.gp.Latent(cov_func=cov_func)

    # Place a GP prior over the function f.
    f = gp.prior("f", X=X)

    prior_checks = pm.sample_prior_predictive(samples=50, random_seed=3)

# Each column of the transposed array is one sampled function.
plt.plot(X, prior_checks['f'].T)
plt.title('Draws from GP')
plt.gcf().tight_layout()
plt.show()

The last line should produce something like this:

[Figure: 50 functions drawn from the GP prior, plotted against X under the title 'Draws from GP'.]

So you can see that even this prior is still consistent with a whole range of possible functions. Hope this answers your questions, let me know!
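
To make the point about inference concrete, here is a minimal NumPy-only sketch (not the PyMC3 implementation) of why the data can inform the kernel hyperparameters. It assumes a kernel of the form eta**2 * exp(-(x - x')**2 / (2 * ls**2)), synthetic data, and hypothetical helper names `expquad` and `log_marginal_likelihood`, and it evaluates the GP log marginal likelihood on a small grid of lengthscales. One detail worth flagging: in `pm.gp.cov.ExpQuad(1, ls=0.1)` the first positional argument is `input_dim` (the number of input columns), not an amplitude, so the snippet fixes only the lengthscale. In PyMC3 an amplitude is added by multiplying the kernel by a scalar such as `eta**2`.

```python
import numpy as np


def expquad(X1, X2, eta, ls):
    """Exponentiated-quadratic kernel: eta**2 * exp(-||x - x'||**2 / (2 * ls**2))."""
    sqdist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return eta**2 * np.exp(-0.5 * sqdist / ls**2)


def log_marginal_likelihood(X, y, eta, ls, sigma):
    """log N(y | 0, K + sigma**2 * I), computed with a Cholesky factorization."""
    n = len(y)
    K = expquad(X, X, eta, ls) + sigma**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y via two triangular solves.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi)


# Synthetic data (an assumption for illustration): a noisy draw from a GP with ls = 0.1.
rng = np.random.default_rng(0)
X = np.linspace(0, 1, 40)[:, None]
eta_true, ls_true, sigma_true = 1.0, 0.1, 0.1
K_noisy = expquad(X, X, eta_true, ls_true) + sigma_true**2 * np.eye(len(X))
y = np.linalg.cholesky(K_noisy) @ rng.standard_normal(len(X))

# Score a few candidate lengthscales, holding eta and sigma at their true values.
ls_grid = np.array([0.01, 0.03, 0.1, 0.3, 1.0])
log_liks = np.array(
    [log_marginal_likelihood(X, y, eta_true, ls, sigma_true) for ls in ls_grid]
)

for ls, ll in zip(ls_grid, log_liks):
    print(f"ls = {ls:4.2f}  log marginal likelihood = {ll:9.2f}")
```

In a PyMC3 model the analogous step would be to give `ls` and `eta` priors (for example a Gamma and a HalfCauchy, as an illustration rather than a recommendation) and pass them to the covariance function, instead of scanning a grid. Lengthscales are also not confined to [0, 1]: sensible values depend on the scale of X.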
