Gaussian processes are very sensitive to the choice of prior, which means that when you change your observation scale, you should adjust the scale of your priors accordingly. I recommend you take a look at Michael Betancourt's tutorial; the issue you are describing is mostly what he's explaining in https://betanalpha.github.io/assets/case_studies/gp_part2/part2.html, but you should also take a look at part 1 and part 3.
For example, in your case you could modify the priors on the parameters as follows:
import pymc3 as pm

# A one-dimensional column vector of inputs.
X = x[:, None]

with pm.Model() as model_reg:
    # hyperprior for the lengthscale kernel parameter
    ℓ = pm.Gamma('ℓ', 1, 0.5)
    # instantiate a covariance function
    cov = pm.gp.cov.ExpQuad(1, ls=ℓ)
    # instantiate a GP prior
    gp = pm.gp.Marginal(cov_func=cov)
    # prior for the observation noise
    ϵ = pm.HalfNormal('ϵ', 0.25)
    # marginal likelihood
    y_pred = gp.marginal_likelihood('y_pred', X=X, y=y, noise=ϵ)
    trace_reg = pm.sample(2000)
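More generally, if the scale of your observations changes again later, another option is to standardize y before fitting and undo the scaling on the predictions, so the same priors keep working. Here is a rough sketch of that idea (the names y_raw, y_mean and y_std are just placeholders for your unscaled data, not something from your code):

# Standardize the observations so unit-scale priors remain sensible.
y_mean = y_raw.mean()
y_std = y_raw.std()
y = (y_raw - y_mean) / y_std

with pm.Model() as model_std:
    ℓ = pm.Gamma('ℓ', 1, 0.5)
    cov = pm.gp.cov.ExpQuad(1, ls=ℓ)
    gp = pm.gp.Marginal(cov_func=cov)
    # on standardized data, a unit-scale noise prior is usually a reasonable default
    ϵ = pm.HalfNormal('ϵ', 1)
    y_pred = gp.marginal_likelihood('y_pred', X=X, y=y, noise=ϵ)
    trace_std = pm.sample(2000)

# Posterior predictions can then be mapped back to the original scale with: pred * y_std + y_mean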
@aloctavodia this might be a good example for you to put into the 2nd (or now 3rd?) edition of your book.
Thanks, I appreciate your help.
Setting epsilon to pm.HalfNormal('ϵ', 0.25) did the trick. I also noticed that in the original case, with np.sin(x)*5, there were divergences during sampling.
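In case it helps anyone else, this is roughly how I checked for divergences and what reduced them in my runs; target_accept is just the generic NUTS setting, nothing specific to this model (and how you pass it can depend on your PyMC3 version):

with model_reg:
    # smaller NUTS steps (higher target_accept) usually reduce divergences,
    # at the cost of slower sampling
    trace_reg = pm.sample(2000, target_accept=0.95)

# count the divergent transitions recorded in the sampler stats
print(trace_reg['diverging'].sum())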