I am studying Gaussian Processes for the first time, using the second edition of “Bayesian Analysis with Python” by Osvaldo Martin.
In the example titled “Gaussian Process Regression” in this Jupyter Notebook https://github.com/PacktPublishing/Bayesian-Analysis-with-Python-Second-Edition/blob/master/Chapter07/07_Gaussian%20process.ipynb
everything seems to work fine as long as the maximum magnitude of the sine function stays close to 1.
If I replace y = np.random.normal(np.sin(x), 0.1) with y = np.random.normal(np.sin(x)*5, 0.1) here is what happens:
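To make the comparison concrete, here is a minimal sketch of the two data-generating setups (the inputs `x` here are my own assumption; the notebook may draw them differently):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical inputs; the notebook may generate x differently
x = rng.uniform(0, 10, 20)

# original targets: noisy sine, amplitude ~1
y_original = rng.normal(np.sin(x), 0.1)

# modified targets: amplitude scaled by 5, same noise sd
y_scaled = rng.normal(np.sin(x) * 5, 0.1)

# the observation scale grows roughly fivefold, while the
# model's priors were chosen for the unit-scale case
print(y_original.std(), y_scaled.std())
```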
If I set the noise argument of gp.marginal_likelihood to one I get something that looks much better (unfortunately I can only upload one image)
Do you have an explanation for that?
Gaussian processes are very sensitive to the choice of priors, which means that when you change the scale of your observations, you should adjust the scale of your priors accordingly. I recommend you take a look at Michael Betancourt’s tutorial; the issue you are describing is largely what he explains in https://betanalpha.github.io/assets/case_studies/gp_part2/part2.html, but you should also take a look at parts 1 and 3.
For example, in your case you could modify the priors on the parameters as follows:
import pymc3 as pm

# A one-dimensional column vector of inputs.
X = x[:, None]

with pm.Model() as model_reg:
    # hyperprior for the lengthscale kernel parameter
    ℓ = pm.Gamma('ℓ', 1, 0.5)
    # instantiate a covariance function
    cov = pm.gp.cov.ExpQuad(1, ls=ℓ)
    # instantiate a GP prior
    gp = pm.gp.Marginal(cov_func=cov)
    # prior for the observation noise
    ϵ = pm.HalfNormal('ϵ', .25)
    y_pred = gp.marginal_likelihood('y_pred', X=X, y=y, noise=ϵ)
    trace_reg = pm.sample(2000)
@aloctavodia this might be a good example for you to put into the 2nd (or now 3rd?) edition of your book.
There is not going to be a third edition, but maybe I can add this to the repo as extra material. And/or it could be part of our book.
Thanks, I appreciate your help.
Setting epsilon to pm.HalfNormal('ϵ', .25) did the trick. I have also noticed that in the original case, with np.sin(x)*5, there were divergences during sampling.
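As a side note (my addition, not from the thread): instead of retuning each prior to the new observation scale, a common alternative is to standardize the targets so that unit-scale priors remain appropriate, then map predictions back afterwards:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 20)          # hypothetical inputs
y = rng.normal(np.sin(x) * 5, 0.1)  # the rescaled targets from the question

# standardize to zero mean and unit standard deviation
y_mean, y_sd = y.mean(), y.std()
y_std = (y - y_mean) / y_sd

# fit the GP to y_std with the original unit-scale priors, then
# transform predictions back: y_pred = f_pred * y_sd + y_mean
print(y_std.mean(), y_std.std())
```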