Hi all, in a different thread I summarised the issues I had when trying to translate my Stan code to PyMC. To investigate what's going wrong, I decided to take a step back and try to reproduce the PyMC examples of the marginal and latent GP formulations. However, my posteriors are quite poor, as you can see below:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm
# set the seed
np.random.seed(1)
n = 100 # The number of data points
X = np.linspace(0, 10, n)[:, None] # The inputs to the GP, they must be arranged as a column vector
# Define the true covariance function and its parameters
ℓ_true = 1.0
η_true = 3.0
cov_func = η_true**2 * pm.gp.cov.Matern52(1, ℓ_true)
# A mean function that is zero everywhere
mean_func = pm.gp.mean.Zero()
# The latent function values are one sample from a multivariate normal
# Note that we have to call `eval()` because PyMC builds these as symbolic tensors
f_true = np.random.multivariate_normal(
    mean_func(X).eval(), cov_func(X).eval() + 1e-8 * np.eye(n), 1
).flatten()
# The observed data is the latent function plus a small amount of IID Gaussian noise
# The standard deviation of the noise is `sigma`
σ_true = 2.0
y = f_true + σ_true * np.random.randn(n)
## Plot the data and the unobserved latent function
fig = plt.figure(figsize=(12, 5))
ax = fig.gca()
ax.plot(X, f_true, "dodgerblue", lw=3, label="True f")
ax.plot(X, y, "ok", ms=3, alpha=0.5, label="Data")
ax.set_xlabel("X")
ax.set_ylabel("The true f(x)")
plt.legend();
with pm.Model() as model_marginal:
    ℓ = pm.Gamma("ℓ", alpha=2, beta=1)
    η = pm.HalfCauchy("η", beta=5)
    cov = η**2 * pm.gp.cov.Matern52(1, ℓ)
    gp = pm.gp.Marginal(cov_func=cov)
    σ = pm.HalfCauchy("σ", beta=5)
    y_ = gp.marginal_likelihood("y", X=X, y=y, sigma=σ)
    idata_marginal = pm.sample(1000, tune=1000, chains=2, cores=1)
az.summary(idata_marginal, var_names=["ℓ", "η", "σ"], round_to=2)
The output is:
| | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| ℓ | 1.20 | 0.32 | 0.66 | 1.78 | 0.01 | 0.01 | 1100.90 | 802.66 | 1.0 |
| η | 4.51 | 1.63 | 2.30 | 7.31 | 0.05 | 0.04 | 1144.79 | 1067.48 | 1.0 |
| σ | 1.93 | 0.15 | 1.66 | 2.21 | 0.00 | 0.00 | 1732.24 | 1234.32 | 1.0 |
As you can see, the posterior mean of η (4.51) is roughly one posterior sd away from the true value (3.00). I was expecting the estimate to be quite a lot better. The results from the latent formulation are comparably poor. Interestingly, the MAP estimates (not shown above) are closer to the true values than the posterior means.
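As a rough sanity check (a NumPy-only sketch I put together, separate from the model above, with the Matern 5/2 kernel written out by hand), one can draw many independent realisations of `f_true` from the same prior and look at how much the realised signal scale varies between draws. With ℓ = 1 on [0, 10], a single draw plausibly contains only a handful of quasi-independent wiggles, so the scale of any one simulated dataset can sit well away from η_true = 3:

```python
import numpy as np

# Sketch: how much does the realised signal scale of a single GP draw
# vary around η_true = 3, for the same kernel used to generate the data?
rng = np.random.default_rng(1)
n, ell, eta = 100, 1.0, 3.0
x = np.linspace(0, 10, n)
r = np.abs(x[:, None] - x[None, :])  # pairwise distances

# Matern 5/2 covariance: η² (1 + √5 r/ℓ + 5r²/(3ℓ²)) exp(−√5 r/ℓ),
# plus jitter for numerical stability (matching the 1e-8 used above)
s = np.sqrt(5.0) * r / ell
K = eta**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s) + 1e-8 * np.eye(n)

# 500 independent draws of f_true via the Cholesky factor
L = np.linalg.cholesky(K)
draws = L @ rng.standard_normal((n, 500))

# realised signal scale (sample std over x) of each draw
scales = draws.std(axis=0)
print(scales.mean(), scales.std())
```

On my understanding, the draw-to-draw spread of `scales` is substantial, which would make a posterior mean for η one sd away from the truth less surprising for a single dataset of this size.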
Am I doing something wrong, or is this the level of agreement you would expect?