I am trying to understand how latent GPs are implemented in PyMC; my immediate goal is to reproduce the computations shown here with PyMC version 5.0.
After some minor syntax changes from PyMC3 to PyMC, and while the full latent model samples (which apparently will take a while), I started playing with this example, mentioned in this post.
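(By "minor syntax changes" I mean edits roughly like the ones below; this is just an illustration of the kind of thing I changed, not the actual diff, and the variable names are only for the example.)
import pymc as pm              # instead of: import pymc3 as pm
import pytensor.tensor as pt   # instead of: import theano.tensor as tt
with pm.Model():
    # sd=... keyword arguments become sigma=...
    eta = pm.HalfNormal("eta", sigma=1)
    # tensor ops come from pytensor.tensor (or pm.math) instead of theano.tensor
    mu = pm.Deterministic("mu", pt.exp(eta))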
I tried to “translate” it to PyMC as follows:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pymc as pm
#import theano
#import theano.tensor as tt
np.random.seed(400)
# draw a sample from the GP (f_true); its exponential is the true Poisson mean (mu_true) as a function of x_true
l_true = 1.0
eta_true = 1.0
x_true = np.linspace(0, 15, 900)
cov = eta_true**2 * pm.gp.cov.ExpQuad(1, l_true)
f_true = np.random.multivariate_normal(np.zeros(len(x_true)), cov(x_true[:,None]).eval(), 1).flatten()
mu_true = np.exp(f_true)
# make data set with different density of observed data in different places
idx1 = np.random.randint(0,250,250)
x1 = x_true[idx1]
y1 = np.random.poisson(mu_true[idx1])
idx2 = np.random.randint(250,350,10)
x2 = x_true[idx2]
y2 = np.random.poisson(mu_true[idx2])
idx3 = np.random.randint(500,700,180)
x3 = x_true[idx3]
y3 = np.random.poisson(mu_true[idx3])
idx4 = np.random.randint(700,900,10)
x4 = x_true[idx4]
y4 = np.random.poisson(mu_true[idx4])
x = np.concatenate((x1, x2, x3, x4), axis=0)
idx = np.argsort(x)
x = x[idx]
y = np.concatenate((y1, y2, y3, y4), axis=0)[idx]
plt.figure(figsize=(16,6))
plt.plot(x_true, mu_true, 'y', label="True GP (Poisson mean)")
plt.plot(x, y, 'ko', label="Observed data", alpha=0.7);
plt.legend();
n = len(x)
X = x[:,None]
print("Number of data points:", n)
with pm.Model() as model:
    l = pm.Gamma("l", alpha=2, beta=1)
    eta = pm.HalfNormal("eta", sigma=1)
    cov = eta**2 * pm.gp.cov.ExpQuad(1, l)
    gp = pm.gp.Latent(cov_func=cov)
    f = gp.prior("f", X=X)
    # y_ = pm.Poisson("y", mu=tt.exp(f), observed=y)
    y_ = pm.Poisson("y", mu=pm.math.exp(f), observed=y)
    tr = pm.sample(1000, chains=1)
## so far so good
Xnew = np.linspace(-1, 20, 200)[:,None]
with model:
    fnew = gp.conditional("fnew", Xnew=Xnew)
    full_preds = pm.sample_posterior_predictive(tr, vars=[fnew])
and here I get the error:
ValueError: Variable name fnew already exists.
which leaves me baffled (sorry, I am a total beginner here), as the whole workflow seems to me identical to some GP examples I found, e.g. the "Homoskedastic GP" section of the first link above.
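For reference, the prediction pattern I think I am imitating looks roughly like this (reconstructed from memory, so the variable names and exact arguments are my paraphrase, not a copy of the docs):
with model:
    # add the GP at new input locations and draw it using the existing trace
    f_pred = gp.conditional("f_pred", Xnew=Xnew)
    ppc = pm.sample_posterior_predictive(tr, var_names=["f_pred"])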
Any hints, please? Thanks a lot!