From the official documentation on PyMC3’s Gaussian Process implementation(s), and more specifically, the Marginal class:
import numpy as np
import pymc3 as pm

# A one dimensional column vector of inputs.
X = np.linspace(0, 1, 10)[:, None]

with pm.Model() as model:
    # Specify the covariance function.
    cov_func = pm.gp.cov.ExpQuad(1, ls=0.1)

    # Specify the GP. The default mean function is `Zero`.
    gp = pm.gp.Marginal(cov_func=cov_func)

    # Place a GP prior over the function f.
    sigma = pm.HalfCauchy("sigma", beta=3)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=sigma)  # `y` is observed data, assumed defined elsewhere

...

# After fitting or sampling, specify the distribution
# at new points with .conditional
Xnew = np.linspace(-1, 2, 50)[:, None]
with model:
    fcond = gp.conditional("fcond", Xnew=Xnew)
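For completeness, here is my rough understanding of how the elided fitting/prediction steps would look. The pm.sample and pm.sample_posterior_predictive calls below are my guess at the intended workflow (and assume a reasonably recent PyMC3), not part of the quoted docs:

import numpy as np
import pymc3 as pm

# Hypothetical observed data; the docs snippet assumes `y` already exists.
X = np.linspace(0, 1, 10)[:, None]
y = np.sin(6 * X).ravel() + 0.1 * np.random.randn(10)

with pm.Model() as model:
    cov_func = pm.gp.cov.ExpQuad(1, ls=0.1)
    gp = pm.gp.Marginal(cov_func=cov_func)
    sigma = pm.HalfCauchy("sigma", beta=3)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=sigma)
    trace = pm.sample(1000, chains=2)  # fit: sample the posterior over sigma

Xnew = np.linspace(-1, 2, 50)[:, None]
with model:
    fcond = gp.conditional("fcond", Xnew=Xnew)
    # Draw predictions of f at the new points.
    pred = pm.sample_posterior_predictive(trace, var_names=["fcond"])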
I see that a prior is placed over sigma, the error term, and that’s it. The kernel basically decides how squiggly/smooth your function should be, yet priors have not been placed on the kernel terms sigma_f and ls. So now I’m wondering: is explicitly modeling the kernel hyperparameters an atypical design choice? (Perhaps this was not included for the sake of tutorial brevity.)
Out of curiosity, how were 1 and 0.1 selected as kernel hyperparameters? I’m uncertain about the ranges they should take on. (Naively, I might assume both sigma_f and ls should fall within the range [0, 1] in the RBF/ExpQuad kernel.)
I ask these questions as it’s my understanding that both the prior and posterior distributions in a Gaussian process are distributions over functions. And so it’s my (naive) belief that pm.gp.cov.ExpQuad(1, ls=0.1) is just one such function, not a distribution of functions.
Could anyone clarify?
Hi there,
I wasn’t involved in designing the example, but I have worked with GPs quite a lot, so I may have some comments for you regarding your questions:
- Fixing the parameters for the covariance function is unusual, in my opinion. I expect this was just a choice made to keep the documentation succinct, as you suggest. One note on the values: in pm.gp.cov.ExpQuad(1, ls=0.1), the first argument is input_dim, the dimensionality of X, not a variance, so the only kernel hyperparameter actually being fixed here is the lengthscale ls=0.1, and I can’t think of a compelling reason for that value beyond it being a convenient example. Unless you have a very good reason for fixing the hyperparameters, you would probably want to do inference over them too (see the sketch at the end of this post). There are longer tutorials that cover this, e.g. Gaussian Process Regression — PyMC3 3.1rc3 documentation.
- Regarding the question of whether fixing the hyperparameters means that the GP is no longer a distribution over functions, I believe the answer to that is no: this is still a distribution over functions. It’s just that it’s a distribution that places more prior mass on functions that have a similar lengthscale and variance to the one specified. You can see that there is still a lot of variation in the functions that are consistent with this prior if you sample from it:
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt

X = np.linspace(0, 1, 100)[:, None]

with pm.Model() as model:
    # Specify the covariance function.
    cov_func = pm.gp.cov.ExpQuad(1, ls=0.1)

    # Specify the GP. The default mean function is `Zero`.
    gp = pm.gp.Latent(cov_func=cov_func)

    # Place a GP prior over the function f.
    f = gp.prior("f", X=X)

    # Must be called inside the model context.
    prior_checks = pm.sample_prior_predictive(samples=50, random_seed=3)

plt.plot(X, prior_checks['f'].T)
plt.title('Draws from GP')
plt.gcf().tight_layout()
The plotting code should produce a figure showing 50 noticeably different draws from the GP prior.
So you can see that even this prior is still consistent with a whole range of possible functions. Hope this answers your questions, let me know!
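In case it’s useful, here is a minimal sketch of what placing priors on the kernel hyperparameters could look like. The particular priors (Gamma for the lengthscale, HalfNormal for the signal standard deviation eta) and their parameters are just illustrative choices, not canonical ones, and the data is a stand-in:

import numpy as np
import pymc3 as pm

# Stand-in data purely for illustration.
X = np.linspace(0, 1, 10)[:, None]
y = np.random.randn(10)

with pm.Model() as model:
    # Priors over the kernel hyperparameters instead of fixed values.
    ls = pm.Gamma("ls", alpha=2, beta=1)    # lengthscale
    eta = pm.HalfNormal("eta", sigma=1)     # signal standard deviation
    cov_func = eta**2 * pm.gp.cov.ExpQuad(1, ls=ls)

    gp = pm.gp.Marginal(cov_func=cov_func)
    sigma = pm.HalfCauchy("sigma", beta=3)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=sigma)

    # The lengthscale, signal variance and noise are now inferred jointly.
    trace = pm.sample(1000, chains=2)

The posterior over ls then tells you how wiggly the data suggest the function is, rather than you having to fix that in advance.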