Hello!
I am struggling to extrapolate my guassian process to points where I don’t have data. I have the following toy gaussian process to fit the deviation of a function (DH_gp) with respect its mean (H_arr):
with pm.Model() as model:
ℓ = pm.Gamma("ℓ", alpha=2, beta=2)
η = pm.HalfNormal("η", sigma=0.001)
gp_cov = η ** 2 * pm.gp.cov.ExpQuad(1, ℓ) + pm.gp.cov.WhiteNoise(1e-4)
gp = pm.gp.Latent(cov_func=gp_cov)
DH_gp = gp.prior("DH_gp", X=z_arr[:, None])
lkl = pm.MvNormal('lkl', mu=tt.as_tensor_variable(DH_gp+H_arr), cov=H_cov, observed=H_data)
trace = pm.sample(50, chains=1, cores=1, return_inferencedata=True, tune=100)
mp = pm.find_MAP()
which gives me:
# plot the results
fig = plt.figure(figsize=(12, 5))
ax = fig.gca()
# plot the samples from the gp posterior with samples and shading
from pymc3.gp.util import plot_gp_dist
plot_gp_dist(ax, trace.posterior["DH_gp"][0, :, :], z_arr[:, None])
# plot the data and the true latent function
#ax.plot(z_arr[:, None], H_arr, "dodgerblue", lw=3, label=r'$H-<H>/\sigma_H$')
ax.errorbar(z_arr[:, None], H_data-H_arr, yerr=sig_H_arr, fmt="ok", ms=3, label="Corrected data")
# axis labels and title
plt.xlabel("X")
plt.ylabel("DH(z)")
plt.title("z")
Now if I try to make a conditional distribution:
with model:
DH_pred = gp.conditional('DH_pred', z_new)
# Sample from the GP conditional distribution
with model:
pred_samples = pm.sample_posterior_predictive(trace.posterior, var_names=['DH_pred'],
samples=2000)
I get the following completely unconstraint distribution for the guassian process:
# plot the results
fig = plt.figure()
ax = fig.gca()
# plot the samples from the gp posterior with samples and shading
from pymc3.gp.util import plot_gp_dist
plot_gp_dist(ax, pred_samples["DH_pred"], z_new)
# plot the data and the true latent function
ax.errorbar(z_arr[:, None], H_data-H_arr, yerr=sig_H_arr, fmt="ok", ms=3, label="Observed data")
# axis labels and title
plt.xlabel("z")
plt.ylabel(r'$DH$')
plt.xlim([0, 3]);
Note that it doesn’t even return the constraints it preivously got in the first plot. I have tried trying to figure this one out for days now but the examples of gp.predict in examples notebook seem to be for an older version of pymc3 with a different syntax.
I would really appreciate any help as I am new to pymc3 and I am very excited about its implementation of guassian process.
Many thanks and looking forward the soon release of Pymc4
if you want to replicate the error here are the data:
H_cov.txt (7.9 KB) H_arr.txt (450 Bytes) H_data.txt (450 Bytes)
Last updated: Wed Mar 17 2021
Python implementation: CPython
Python version : 3.8.5
IPython version : 7.20.0
numpy : 1.20.1
scipy : 1.6.1
theano : 1.1.2
pyccl : 2.2.0
pymc3 : 3.11.2
matplotlib: 3.3.4
arviz : 0.11.2
pandas : 1.2.2
Watermark: 2.2.0