I am working with a Gaussian Process (this follows on from the question https://discourse.pymc.io/t/help-with-a-gaussian-process-for-very-few-data/9818/4). If the arrays to use are
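import numpy as np
import pymc as pm

X = np.array([32, 38, 94, 83, 99, 78])
y = np.array([1702, 1514, 683, 269, 900, 86])
# Standardize both arrays to zero mean and unit standard deviation.
y = (y - np.mean(y)) / np.std(y)
X = (X - np.mean(X)) / np.std(X)
X = X.reshape(-1, 1)  # column vector, same layout as X_new further down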
my code is (I am copying it here because I am now using PyMC rather than PyMC3):
with pm.Model() as d_model:
    # Specify the covariance function.
    vert = 1*pm.HalfNormal("vert", sigma=1)
    l = pm.HalfNormal(name='l', sigma=1)
    cov_func = vert**2 * pm.gp.cov.Matern32(1, ls=l)
    # Specify the GP. The default mean function is zero.
    #gp = pm.gp.Marginal(mean_init, cov_func=cov_func)
    gp = pm.gp.Marginal(cov_func=cov_func)
    # Place a GP prior over the function f and do the noise:
    sigma = pm.HalfNormal("sigma", sigma=1)
    y_ = gp.marginal_likelihood("y_", X=X, y=y.flatten(), noise=sigma)
    # MCMC:
    trace = pm.sample(10000, chains=3, tune=1000, target_accept=0.99)
The error appears when I sample the posterior predictive:
X_new = np.linspace(-1.5, 1.5, 1500)
X_new = X_new.reshape(-1,1)
with d_model:
    f_p = gp.conditional("f_p", X_new)
    pred_samples = pm.sample_posterior_predictive(trace, samples=10, keep_size=False)
I get: ValueError: conflicting sizes for dimension 'chain': length 1 on the data but length 3 on coordinate 'chain'. Note that because sample_posterior_predictive is very slow (for this very same problem it was considerably faster in PyMC3), I am only sampling 10 draws (which requires keep_size=False).
The error can be avoided by passing return_inferencedata=False to sample_posterior_predictive, but that feels like going back to v3 and I doubt it is the proper solution. Thanks.
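My reading of the message, which is only a guess, is that the 10 flattened draws no longer carry the 3-chain layout that the returned InferenceData expects. The NumPy-only sketch below is not PyMC code; it uses a made-up array named posterior as a stand-in for one sampled variable, and shows how flattening drops the chain dimension while thinning along the draw axis keeps it. The value of step is an arbitrary choice for illustration.

```python
import numpy as np

n_chains, n_draws = 3, 10000  # same sizes as pm.sample(10000, chains=3)

# Hypothetical stand-in for one posterior variable, shaped (chain, draw).
rng = np.random.default_rng(0)
posterior = rng.normal(size=(n_chains, n_draws))

# Flattening and keeping 10 draws loses the chain dimension entirely.
flat_subset = posterior.reshape(-1)[:10]

# Thinning along the draw axis keeps all 3 chains, 10 draws each.
step = 1000  # assumed thinning interval
thinned = posterior[:, ::step]

print(flat_subset.shape, thinned.shape)
```

If that reading is right, the equivalent on the real trace would presumably be something like trace.sel(draw=slice(None, None, step)) passed to sample_posterior_predictive without samples or keep_size, giving 30 predictive draws in total; I have not tested this here.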