Non Stationary GP Regression

Hi, I am implementing non-stationary GP regression using a Gibbs kernel. The lengthscale and signal covariance are themselves GPs.

with pm.Model() as model:
    # Mean
    mean = pm.gp.mean.Zero()

    # Noise
    noise_obs = pm.Normal('noise_obs', mu=0, sd=1)

    # Priors for the hyperparameters
    l_l = pm.Normal('l_l', mu=0, sd=1)
    l_sig = pm.Normal('l_sig', mu=0, sd=1)
    signal_cov_l = pm.Normal('signal_cov_l', mu=0, sd=1)
    signal_cov_sig = pm.Normal('signal_cov_sig', mu=0, sd=1)

    # Covariance functions for Level 2
    cov_l = signal_cov_l**2 * pm.gp.cov.ExpQuad(1, l_l)
    cov_sig = signal_cov_sig**2 * pm.gp.cov.ExpQuad(1, l_sig)

    # Level 2
    gp_l = pm.gp.Marginal(mean_func=mean, cov_func=cov_l)
    gp_sig = pm.gp.Marginal(mean_func=mean, cov_func=cov_sig)

    # Covariance function for Level 1
    cov_obs = gp_sig**2 * pm.gp.cov.Gibbs(1, gp_l)

    # Level 1
    gp_obs = pm.gp.Marginal(mean_func=mean, cov_func=cov_obs)

    # Level 0
    y_obs = gp_obs.marginal_likelihood("y_obs", X=Xtrain[:, None], y=ytrain, noise=noise_obs)

    marginal_post = pm.find_MAP()

I am getting the error unsupported operand type(s) for ** or pow(): 'Marginal' and 'int' for the covariance function in Level 1, and I do understand why it occurs.

My question is: is there a workaround for this, or will I have to write my own custom implementation? The model needs to stay as it is, so I can’t change it for now.

Thanks and Regards!
Rohan

You’re on the right track. The lengthscale and the variance both have to be gp.Latent GPs. Try:

# Level 2: use Latent instead of Marginal
gp_l = pm.gp.Latent(...)
gp_sig = pm.gp.Latent(...)

# and then actually construct the GPs over the inputs
lengthscale = gp_l.prior("lengthscale", X=Xtrain[:, None])
variance = gp_sig.prior("variance", X=Xtrain[:, None])

# Level 1 (tt is theano.tensor)
cov_obs = tt.outer(variance, variance) * pm.gp.cov.Gibbs(1, lengthscale)

You can check the docs to read about the difference between Marginal and Latent. In short, Marginal assumes the observation noise is Gaussian, so the GP f can be integrated out analytically,

p(y \mid X) = \int p(y \mid f)\, p(f \mid X)\, df

whereas Latent makes no assumptions about the observation noise, and f isn’t integrated out. So that assumption applies to gp_obs, but not to gp_sig or gp_l.
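Concretely, with Gaussian noise that integral has the closed form p(y \mid X) = \mathcal{N}(0, K + \sigma^2 I), which is what Marginal evaluates. A quick NumPy sanity check of that identity (the ExpQuad lengthscale and noise level below are chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([0.0, 0.5, 1.0])

# ExpQuad kernel matrix on three inputs (lengthscale 0.3, arbitrary)
K = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / 0.3 ** 2)
sigma = 0.5  # Gaussian observation noise (also arbitrary)

# Draw f from the GP prior many times, then y = f + noise
L = np.linalg.cholesky(K + 1e-9 * np.eye(3))
f = L @ rng.standard_normal((3, 200_000))
y = f + sigma * rng.standard_normal((3, 200_000))

# Empirical covariance of y vs the analytic marginal N(0, K + sigma^2 I)
emp_cov = np.cov(y)
marg_cov = K + sigma ** 2 * np.eye(3)
```

The empirical covariance of the simulated y draws matches K + \sigma^2 I, which is the density Marginal computes directly without ever sampling f.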


Hi,

Thanks for your answer @bwengals

First of all, the modification didn’t work because the Gibbs covariance function requires a callable for its lengthscale argument.

I tried other changes, like passing gp_l in place of lengthscale and gp_sig in place of variance, but then the error becomes 'Latent' object has no attribute 'ndim', which occurs at the variance product in cov_obs.

I also have two questions:

  1. Why are we taking the outer product of the variance?

  2. Why do we need pm.gp.Latent? I understand there is no additive noise, but the documentation says it is used when the data are not normally distributed.

Thanks,
Rohan

Oh, sorry about the error. I’m traveling and didn’t have a chance to actually run the code. Gibbs takes a callable for the lengthscale_func, so you’ll need to do something like lengthscale_func = lambda X: lengthscale. For the variance, you can also take a look at ScaledCov in this PR. The cov_func argument would be your Gibbs covariance. The scaling_func argument works like the Gibbs lengthscale argument.
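To see what that lambda needs to return, here is a plain-NumPy sketch of the Gibbs kernel itself: the callable just has to map the inputs X to one positive lengthscale per point. The kernel formula is the standard one; the linear lengthscale function below is made up for illustration.

```python
import numpy as np

def gibbs_kernel(X, ell):
    # Gibbs covariance with input-dependent lengthscale ell(x),
    # passed here as a vector with one positive value per input
    l2 = ell[:, None] ** 2 + ell[None, :] ** 2
    prefactor = np.sqrt(2.0 * np.outer(ell, ell) / l2)
    return prefactor * np.exp(-((X[:, None] - X[None, :]) ** 2) / l2)

X = np.linspace(0.0, 1.0, 5)
ell = 0.2 + 0.3 * X  # hypothetical lengthscale function, positive everywhere
K = gibbs_kernel(X, ell)

# With a constant lengthscale, the Gibbs kernel reduces to plain ExpQuad
K_const = gibbs_kernel(X, np.full_like(X, 0.25))
K_expquad = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / 0.25 ** 2)
```

The resulting matrix is symmetric with a unit diagonal, and the last two lines check the ExpQuad limit, which is a useful sanity test for any custom lengthscale function you pass in.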

On point 1, you’re multiplying a stationary covariance function by a scaling function. Mathematically this is

k(x, x') = \sigma^2(x)\, k_{\mathrm{gibbs}}(x, x')\, \sigma^2(x'), \qquad K = \left(\sigma^2 (\sigma^2)^\top\right) \odot K_{\mathrm{gibbs}}

where \sigma^2(x) is the variance function (a GP in your case) and \odot is the Hadamard (elementwise) product. The matrix \sigma^2 (\sigma^2)^\top is exactly the outer product in the code. You may need to constrain the variance to be non-negative, e.g. by using \exp of the GP instead.
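That identity is easy to verify numerically: the outer-product scaling is the same matrix as sandwiching the base kernel between diagonal matrices of \sigma^2 values (the kernel and the scaling values below are arbitrary, and the exp keeps the scaling positive):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.linspace(0.0, 1.0, 4)

# Any base kernel matrix (ExpQuad here, lengthscale arbitrary)
K = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / 0.3 ** 2)

# sigma^2(x): exponentiated random draws, so the scaling stays positive
s2 = np.exp(rng.standard_normal(4))

scaled = np.outer(s2, s2) * K               # the tt.outer(...) * Gibbs(...) pattern
scaled_alt = np.diag(s2) @ K @ np.diag(s2)  # same matrix as a diagonal sandwich
```

Both expressions give the same symmetric matrix, which is why tt.outer(variance, variance) is the right elementwise multiplier for the covariance.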

On point 2, you need to use gp.Latent whenever you don’t have a GP observed with Gaussian noise in the observations. So that means non-Normal observations, or your case where the lengthscale and variance are GPs.


Thanks for your reply! I will check it and get back with results!