 # Gp.conditional returns unconstraint distribution?

Hello!

I am struggling to extrapolate my guassian process to points where I don’t have data. I have the following toy gaussian process to fit the deviation of a function (DH_gp) with respect its mean (H_arr):

with pm.Model() as model:
ℓ = pm.Gamma("ℓ", alpha=2, beta=2)
η = pm.HalfNormal("η", sigma=0.001)
gp_cov = η ** 2 * pm.gp.cov.ExpQuad(1, ℓ) + pm.gp.cov.WhiteNoise(1e-4)
gp = pm.gp.Latent(cov_func=gp_cov)

DH_gp = gp.prior("DH_gp", X=z_arr[:, None])

lkl = pm.MvNormal('lkl', mu=tt.as_tensor_variable(DH_gp+H_arr), cov=H_cov, observed=H_data)
trace = pm.sample(50, chains=1, cores=1, return_inferencedata=True, tune=100)
mp = pm.find_MAP()


which gives me:

# plot the results
fig = plt.figure(figsize=(12, 5))
ax = fig.gca()

# plot the samples from the gp posterior with samples and shading
from pymc3.gp.util import plot_gp_dist

plot_gp_dist(ax, trace.posterior["DH_gp"][0, :, :], z_arr[:, None])

# plot the data and the true latent function
#ax.plot(z_arr[:, None], H_arr, "dodgerblue", lw=3, label=r'$H-<H>/\sigma_H$')
ax.errorbar(z_arr[:, None], H_data-H_arr, yerr=sig_H_arr, fmt="ok", ms=3, label="Corrected data")

# axis labels and title
plt.xlabel("X")
plt.ylabel("DH(z)")
plt.title("z")


Now if I try to make a conditional distribution:

with model:
DH_pred = gp.conditional('DH_pred', z_new)
# Sample from the GP conditional distribution
with model:
pred_samples = pm.sample_posterior_predictive(trace.posterior, var_names=['DH_pred'],
samples=2000)


I get the following completely unconstraint distribution for the guassian process:

# plot the results
fig = plt.figure()
ax = fig.gca()

# plot the samples from the gp posterior with samples and shading
from pymc3.gp.util import plot_gp_dist

plot_gp_dist(ax, pred_samples["DH_pred"], z_new)

# plot the data and the true latent function
ax.errorbar(z_arr[:, None], H_data-H_arr, yerr=sig_H_arr, fmt="ok", ms=3, label="Observed data")

# axis labels and title
plt.xlabel("z")
plt.ylabel(r'$DH$')
plt.xlim([0, 3]); Note that it doesn’t even return the constraints it preivously got in the first plot. I have tried trying to figure this one out for days now but the examples of gp.predict in examples notebook seem to be for an older version of pymc3 with a different syntax.

I would really appreciate any help as I am new to pymc3 and I am very excited about its implementation of guassian process.

Many thanks and looking forward the soon release of Pymc4

if you want to replicate the error here are the data:
H_cov.txt (7.9 KB) H_arr.txt (450 Bytes) H_data.txt (450 Bytes)

Last updated: Wed Mar 17 2021

Python implementation: CPython
Python version       : 3.8.5
IPython version      : 7.20.0

numpy     : 1.20.1
scipy     : 1.6.1
theano    : 1.1.2
pyccl     : 2.2.0
pymc3     : 3.11.2
matplotlib: 3.3.4
arviz     : 0.11.2
pandas    : 1.2.2

Watermark: 2.2.0