# Gaussian Processes: Sampling kernel hyperparameters?

From the official documentation on PyMC3’s Gaussian Process implementations, and more specifically, the `Marginal` class:

``````
# A one dimensional column vector of inputs.
X = np.linspace(0, 1, 10)[:, None]

with pm.Model() as model:
    # Specify the covariance function.
    cov_func = pm.gp.cov.ExpQuad(1, ls=0.1)

    # Specify the GP.  The default mean function is `Zero`.
    gp = pm.gp.Marginal(cov_func=cov_func)

    # Place a GP prior over the function f.
    sigma = pm.HalfCauchy("sigma", beta=3)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=sigma)

...

# After fitting or sampling, specify the distribution
# at new points with .conditional
Xnew = np.linspace(-1, 2, 50)[:, None]

with model:
    fcond = gp.conditional("fcond", Xnew=Xnew)
``````

I see that a prior is placed over `sigma`, the noise term, and that’s it. The kernel basically decides how squiggly/smooth your function should be, yet no priors have been placed on the kernel terms `sigma_f` and `ls`. So now I’m wondering: is not explicitly modeling the kernel hyperparameters an atypical design choice? (Perhaps this was just omitted for the sake of tutorial brevity.)

Out of curiosity, how were 1 and 0.1 selected as kernel hyperparameters? I’m uncertain about the ranges they should take on. (Naively, I might assume both `sigma_f` and `ls` should fall within the range [0, 1] in the RBF/ExpQuad kernel.)

I ask these questions because it’s my understanding that both the prior and posterior distributions in a Gaussian process are distributions over functions. And so it’s my (naive) belief that `pm.gp.cov.ExpQuad(1, ls=0.1)` specifies just one such function, not a distribution over functions.
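For concreteness, here is my understanding of what the ExpQuad kernel computes, written out as a plain numpy sketch of the standard RBF formula (`exp_quad` is my own helper, not the PyMC3 API):

```python
import numpy as np

def exp_quad(x1, x2, ls=0.1, sigma_f=1.0):
    # Standard RBF / exponentiated-quadratic covariance:
    # k(x, x') = sigma_f^2 * exp(-(x - x')^2 / (2 * ls^2))
    # `ls` is a lengthscale in the units of the inputs, so it is
    # not restricted to [0, 1].
    return sigma_f**2 * np.exp(-0.5 * ((x1 - x2) / ls) ** 2)

exp_quad(0.0, 0.0)  # 1.0: a point is perfectly correlated with itself
exp_quad(0.0, 0.5)  # ~3.7e-06: half a unit apart is nearly independent when ls=0.1
```

So the kernel itself is a covariance function over pairs of inputs, which is what makes `ls=0.1` feel to me like a single fixed choice rather than something being inferred.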

Could anyone clarify?

And… crickets!

Hi there,

I wasn’t involved in designing the example, but I have worked with GPs quite a lot, so I may have some comments for you regarding your questions:

• Fixing the parameters of the covariance function is unusual, in my opinion. I expect this was just a choice made to keep the documentation succinct, as you suggest. I can’t think of a compelling reason for the two values chosen, apart from them being convenient example values. Unless you have a very good reason for fixing them, you would probably want to do inference over them too. There are longer tutorials covering this that you might like: Gaussian Process Regression — PyMC3 3.1rc3 documentation
• Regarding the question of whether fixing the hyperparameters means that the GP is no longer a distribution over functions: I believe the answer is no, this is still a distribution over functions. It’s just a distribution that places more prior mass on functions with a lengthscale and variance similar to the ones specified. You can see that there is still a lot of variation in the functions consistent with this prior if you sample from it:
``````
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt

X = np.linspace(0, 1, 100)[:, None]

with pm.Model() as model:
    # Specify the covariance function.
    cov_func = pm.gp.cov.ExpQuad(1, ls=0.1)

    # Specify the GP.  The default mean function is `Zero`.
    gp = pm.gp.Latent(cov_func=cov_func)

    # Place a GP prior over the function f.
    f = gp.prior("f", X=X)

    prior_checks = pm.sample_prior_predictive(samples=50, random_seed=3)

plt.plot(X, prior_checks['f'].T)
plt.title('Draws from GP')
plt.gcf().tight_layout()
``````

The plotting lines at the end should produce a figure showing 50 quite varied draws from the GP prior.

So you can see that even this prior is still consistent with a whole range of possible functions. Hope this answers your questions, let me know!
