I think your issue here is the priors and the scaling of the data. Your `vert` parameter controls the “amplitude” of the GP, and for your data it should probably be on the order of 100–1000. `sigma` controls the noise, so it’s probably in the 100s too. I’d recommend rescaling your `y` data, since that also helps with numerical stability, and then looking a bit more carefully at the priors for `vert` and `sigma` too.
Especially with such a small dataset, using strongly informative priors becomes much more important! It will help with divergences too.
Here’s a modified version of your code that gives more sensible output and might help you get started:
import numpy as np
import matplotlib.pyplot as plt
import pymc as pm

X = np.array([32, 38, 94, 83, 99, 78])
y = np.array([1702, 1514, 683, 269, 900, 86])

# Standardize y so the GP amplitude and noise are O(1)
y_sc = (y - np.mean(y)) / np.std(y)

with pm.Model() as d_model:
    # Specify the covariance function
    vert = pm.HalfNormal("vert", sigma=1.0)
    l = pm.Gamma(name='l', alpha=7, beta=1)
    cov_func = vert**2 * pm.gp.cov.Matern32(1, ls=l)

    # Specify the mean function (and actually pass it to the GP below)
    c = pm.HalfNormal('c', sigma=1.0)
    mean_init = pm.gp.mean.Constant(c=c)

    # Specify the GP
    gp = pm.gp.Marginal(mean_func=mean_init, cov_func=cov_func)

    # Place a GP prior over the function and add observation noise
    sigma = pm.HalfNormal("sigma", sigma=1)
    y_ = gp.marginal_likelihood("y_", X=X[:, None], y=y_sc, noise=sigma)

    # MCMC
    trace = pm.sample(500, chains=2, tune=1000, target_accept=0.95)

# Predict at new inputs (the grid range here is arbitrary, chosen to span the data)
X_new = np.linspace(20, 110, 100)[:, None]
with d_model:
    f_p = gp.conditional("f_p", X_new)
    pred_samples = pm.sample_posterior_predictive(trace, var_names=["f_p"])

fig = plt.figure(figsize=(15, 5))
ax = fig.gca()
f_p = pred_samples.posterior_predictive['f_p'].stack(samples=['chain', 'draw'])
pm.gp.util.plot_gp_dist(ax, f_p.values.T, X_new.flatten());
plt.plot(X, y_sc, 'o', color='dodgerblue', alpha=0.5, markersize=20);
I used PyMC v4 though, so it’ll need a couple small changes to work with PyMC3.
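One more note: the plot above is on the standardized scale. If you want to read the fit in the original units of `y`, you can just undo the standardization, since it’s an affine transform. A small sketch, reusing the training mean and std computed above:

mu_y, sd_y = np.mean(y), np.std(y)
# Back-transform the posterior draws of f to the original units of y
f_p_orig = f_p.values * sd_y + mu_y

fig = plt.figure(figsize=(15, 5))
ax = fig.gca()
pm.gp.util.plot_gp_dist(ax, f_p_orig.T, X_new.flatten());
plt.plot(X, y, 'o', color='dodgerblue', alpha=0.5, markersize=20);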
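And as a way of “looking more carefully” at the priors for `vert` and `sigma`, a quick prior predictive check is one option. A minimal sketch, reusing `d_model` from above (the number of draws is arbitrary):

# Draw from the prior predictive to see what the priors imply about the
# (standardized) observations before conditioning on any data.
with d_model:
    prior_idata = pm.sample_prior_predictive(samples=500)

# y_sc has roughly unit scale, so if most prior draws of y_ land far outside
# about -2 to 2, the priors on vert and sigma are probably too wide.
print(float(prior_idata.prior_predictive["y_"].std()))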