Non Stationary GP Regression


#1

Hi, I am implementing non-stationary GP regression using Gibbs kernel. The lengthscale and signal covariance are also GP.

with pm.Model() as model:
    # Mean
    mean = pm.gp.mean.Zero()

    # Noise
    noise_obs = pm.Normal('noise_obs', mu=0, sd=1)

    # Prior for hyperparameters
    l_l = pm.Normal('l_l', mu=0, sd=1)
    l_sig = pm.Normal('l_sig', mu=0, sd=1)
    signal_cov_l = pm.Normal('signal_cov_l', mu=0, sd=1)
    signal_cov_sig = pm.Normal('signal_cov_sig', mu=0, sd=1)

    # Covariance functions for Level 2
    cov_l = signal_cov_l**2 * pm.gp.cov.ExpQuad(1, l_l)
    cov_sig = signal_cov_sig**2 * pm.gp.cov.ExpQuad(1, l_sig)

    # Level 2
    gp_l = pm.gp.Marginal(mean_func=mean, cov_func=cov_l)
    gp_sig = pm.gp.Marginal(mean_func=mean, cov_func=cov_sig)

    # Covariance function for Level 1
    cov_obs = gp_sig**2 * pm.gp.cov.Gibbs(1, gp_l)

    # Level 1
    gp_obs = pm.gp.Marginal(mean_func=mean, cov_func=cov_obs)

    # Level 0
    y_obs = gp_obs.marginal_likelihood("y_obs", X=Xtrain[:,None], y=ytrain, noise=noise_obs)

    marginal_post = pm.find_MAP()

I am getting the error unsupported operand type(s) for ** or pow(): ‘Marginal’ and ‘int’ for the covariance function in Level 1. I do understand why it occurs: gp_sig is a Marginal GP object, not a tensor, so it cannot be squared.

My question is: is there a workaround for this, or will I have to write my own custom implementation? The model needs to stay the way it is, so I can’t change it for now.

Thanks and Regards!
Rohan


#2

You’re on the right track. The lengthscale and the variance both have to be gp.Latent GPs. Try:

# Level 2, instead of Marginal
gp_l = pm.gp.Latent(...)
gp_sig = pm.gp.Latent(...)

# and then actually construct the GP
lengthscale = gp_l.prior("lengthscale", X=Xtrain[:,None])
variance = gp_sig.prior("variance", X=Xtrain[:,None])     

# Level 1
cov_obs = tt.outer(variance, variance) * pm.gp.cov.Gibbs(1, lengthscale)

You can check the docs to read about the difference between Marginal and Latent. In short, Marginal assumes the observation noise is Gaussian, and the GP f is integrated out,

p(y \mid X) = \int p(y \mid f)\, p(f \mid X)\, df

whereas Latent makes no assumptions about the observation noise, and f isn’t integrated out. So that assumption applies to gp_obs, but not to gp_sig or gp_l.


#3

Hi,

Thanks for your answer @bwengals

First of all, the modification didn’t work, because the Gibbs covariance function requires a callable for its lengthscale argument.

I tried making other changes, like replacing lengthscale with gp_l and variance with gp_sig, but then the error was 'Latent' object has no attribute 'ndim', which is occurring due to the variance product in cov_obs.

I also have two questions:

  1. Why are we taking the outer product of the variance?

  2. Why do we need pm.gp.Latent? I understand there is no additive noise, but the documentation says it is used when the data is not normally distributed.

Thanks,
Rohan


#4

Oh, sorry about the error. I’m traveling and didn’t have a chance to actually run the code. Gibbs takes a callable for its lengthscale_func argument, so you’ll need to do something like lengthscale_func = lambda X: lengthscale. For the variance, you can take a look at ScaledCov in this PR. Its cov_func argument would be your Gibbs covariance, and its scaling_func argument works like the Gibbs lengthscale_func argument.
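To see what that callable feeds into, the Gibbs kernel's standard form uses a position-dependent lengthscale ℓ(x). A minimal NumPy sketch of the formula (the lengthscale function here is made up purely for illustration):

```python
import numpy as np

# Gibbs kernel with input-dependent lengthscale ell(x):
#   k(x, x') = sqrt(2 ell(x) ell(x') / (ell(x)^2 + ell(x')^2))
#              * exp(-(x - x')^2 / (ell(x)^2 + ell(x')^2))
def gibbs_kernel(x, xp, ell):
    lx, lxp = ell(x), ell(xp)
    denom = lx**2 + lxp**2
    return np.sqrt(2.0 * lx * lxp / denom) * np.exp(-(x - xp)**2 / denom)

# A made-up positive lengthscale function, standing in for the Latent GP
ell = lambda x: 0.5 + 0.3 * x**2

print(gibbs_kernel(1.0, 1.0, ell))   # diagonal is exactly 1
print(gibbs_kernel(0.2, 1.5, ell))   # off-diagonal values lie in (0, 1)
```

With a constant ℓ this reduces to the usual ExpQuad kernel, which is a quick sanity check on the implementation.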

On point 1, you’re multiplying a stationary covariance function by a scaling function; mathematically this is

k(x, x') = \sigma^2(x)\, k_{gibbs}(x, x')\, \sigma^2(x')

where \sigma^2(x) is the variance function (a GP in your case). Evaluated on a grid of inputs, the matrix of \sigma^2(x_i)\, \sigma^2(x_j) terms is exactly an outer product, which is then multiplied elementwise (the Hadamard product, \odot) with the k_{gibbs} matrix. You may need to constrain the variance function to be non-negative, e.g. by exponentiating the GP.
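A small NumPy check of that outer-product claim, with an arbitrary stationary kernel matrix and made-up scaling values, confirming that the outer-product form matches the entrywise \sigma(x_i)\,\sigma(x_j)\,k(x_i, x_j):

```python
import numpy as np

x = np.linspace(0, 1, 4)
sigma = np.array([0.5, 1.0, 1.5, 2.0])          # scaling function evaluated at each x
K = np.exp(-0.5 * np.subtract.outer(x, x)**2)   # some stationary kernel matrix

# Outer product of the scaling values, Hadamard-multiplied with K
K_scaled = np.outer(sigma, sigma) * K

# The same thing entry by entry: sigma(x_i) * sigma(x_j) * k(x_i, x_j)
K_loop = np.array([[sigma[i] * sigma[j] * K[i, j] for j in range(4)]
                   for i in range(4)])

print(np.allclose(K_scaled, K_loop))   # True
```

Note the result is still symmetric, as a valid covariance matrix must be.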

On point 2, you need to use gp.Latent whenever a GP is not observed directly with Gaussian noise. That means non-Normal observations, or your case, where the lengthscale and variance GPs are latent quantities that are never observed at all.


#5

Thanks for your reply! I will check it and get back with results!