Gaussian Processes - combining constant features with covariance

I am trying to do the following:

f = K1((x1, y1), (x1, y1)*) + K2((x2, y2), (x2, y2)*) + x4 * K3((x1, y1), (x1, y1)*) 

where K1, K2, and K3 are kernels that each use two dimensions (x and y). The third kernel (K3) needs to be scaled by x4. I thought I could do this with K4(x4) * K3, but I don't see a way to specify the active_dims in the constant kernel.

Here is my code, in case my explanation above wasn't clear:

import numpy as np
import pymc3 as pm

# Features contains rows of [lon_eq, lat_eq, lon_sta, lat_sta, dist_jb]
X = features

with pm.Model() as model:

    def make_gp(name, active_dims):
        theta = pm.HalfCauchy(f'theta_{name}', 5)
        pi = pm.HalfCauchy(f'pi_{name}', 5)
        rho = pm.HalfCauchy(f'rho_{name}', 5)
        cov = theta * pm.gp.cov.Exponential(input_dim=5, ls=rho, active_dims=active_dims) + pi
        gp = pm.gp.MarginalSparse(cov_func=cov, approx="FITC")
        return {
            'theta': theta,
            'pi': pi,
            'rho': rho,
            'cov': cov,
            'gp': gp,
        }
    parts = {}
    parts['eq'] = make_gp('eq', [0, 1])
    parts['sta'] = make_gp('sta', [2, 3])
    parts['dist_jb'] = make_gp('dist_jb', [0, 1])

    # How to combine parts['dist_jb']['gp'] with a constant kernel on 'dist_jb' (active_dims=[4])?
    gp = parts['eq']['gp'] + parts['sta']['gp']

    # Model error
    sigma_noise = pm.HalfNormal("sigma_noise", sd=1)

    # initialize inducing points with K-means
    Xu = pm.gp.util.kmeans_inducing_points(20, X)  # 20 inducing points (placeholder count)

    y = gp.marginal_likelihood("y", X=X, Xu=Xu, y=resids, noise=sigma_noise)

[Edit]: You specify the active_dims for a Constant kernel the same way as for the Exponential kernel in your code (it fills the off-diagonal of the covariance matrix with the same value).

That’s what I was thinking, but the Constant class doesn’t currently support that:

def __init__(self, c):
    super(Constant, self).__init__(1, None)
    self.c = c

It should be fine, because it inherits from the base covariance class.

Yeah, I think that __init__() would just have to be modified to permit passing the input_dim and active_dims arguments.

I may be mistaken, but shouldn't the Constant covariance function be independent of input_dim and active_dims? It should just add a constant across the whole matrix:

K = k(x, x') + c 

That's why input_dim and active_dims can't be set. Do you get the covariance structure you're after this way? If the covariance function is scaled by a constant, you can just do

K = c*k(x, x')

without using the Constant covariance object.
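To illustrate the difference (a plain-NumPy sketch, not PyMC3 code; the kernel helper `sq_exp` here is just for demonstration): multiplying by `c` rescales every entry of the covariance matrix, while a Constant kernel adds `c` to every entry:

```python
import numpy as np

def sq_exp(X, ls=1.0):
    # squared-exponential kernel matrix on a 1-D input
    d = X[:, None] - X[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

X = np.linspace(0, 1, 4)
K = sq_exp(X)
c = 2.0

K_scaled = c * K    # c * k(x, x'): rescales every entry
K_shifted = K + c   # k(x, x') + c: adds a constant offset everywhere

# scaling preserves ratios between entries; shifting does not
assert np.allclose(K_scaled / K, c)
assert np.allclose(K_shifted - K, c)
```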

@bwengals you are correct about applying a single constant. However, I am trying to scale the covariance matrix based on another variable. If you are interested, the process is described in Bussas et al. (2017). I don't fully understand the process yet, but I'm working through it.

Oh ok, I think I get it now. How about this? You can combine covariance functions (in general, not talking about pymc3 specifically) like this:

f(x)k(x, x')f(x')

where f(x) is some scaling function of x or maybe other parameters.

But in your case, f(x) is just x4, right? Here is a class that you can use to do this. I haven't tested it carefully, so keep an eye on it. This really should be added to the codebase…

import theano.tensor as tt
import pymc3 as pm

class ScalingFunc(pm.gp.cov.Covariance):
    def __init__(self, input_dim, sc_func, args=None, active_dims=None):
        super(ScalingFunc, self).__init__(input_dim, active_dims)
        self.sc_func = pm.gp.cov.handle_args(sc_func, args)
        self.args = args

    def full(self, X, Xs=None):
        X, Xs = self._slice(X, Xs)
        sc_x = self.sc_func(tt.as_tensor_variable(X), self.args)
        if Xs is None:
            return tt.outer(sc_x, sc_x)
        else:
            sc_xs = self.sc_func(tt.as_tensor_variable(Xs), self.args)
            return tt.outer(sc_x, sc_xs)

    def diag(self, X):
        return tt.square(self.sc_func(tt.as_tensor_variable(X), self.args))

And a simple example:

X = np.linspace(0, 10, 100)[:, None]
sc_func = lambda X: X

k3 = pm.gp.cov.ExpQuad(1, ls=1)
k4 = ScalingFunc(1, sc_func)
k = k3 * k4

The scaling function, sc_func, is used the same way as the function inputs for
WarpedInput and Gibbs, so to generalize check out the docs on those.
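In plain NumPy terms (a sketch, not PyMC3 code; `exp_quad` here stands in for the ExpQuad kernel), with sc_func(X) = X the product kernel k3*k4 works out to K[i, j] = X[i] * X[j] * k3(X[i], X[j]), i.e. the base covariance scaled by an outer product:

```python
import numpy as np

X = np.linspace(0, 10, 100)[:, None]

def exp_quad(X, ls=1.0):
    # squared-exponential kernel matrix, stand-in for pm.gp.cov.ExpQuad
    d = X - X.T
    return np.exp(-0.5 * (d / ls) ** 2)

K3 = exp_quad(X)
sc = X.ravel()              # sc_func(X) = X
K = np.outer(sc, sc) * K3   # f(x) k(x, x') f(x')

# the diagonal is sc**2 times the base diagonal, matching ScalingFunc.diag
assert np.allclose(np.diag(K), sc ** 2 * np.diag(K3))
```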


Actually, for your specific k4 this is equivalent to multiplying k3 by Linear, I believe.
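That equivalence is easy to check numerically (a NumPy sketch, not PyMC3 code): the Linear kernel with offset c = 0 is k(x, x') = (x - c)(x' - c) = x x', which is exactly the outer-product factor that ScalingFunc produces for sc_func(X) = X:

```python
import numpy as np

x = np.linspace(0, 10, 50)

K_scaling = np.outer(x, x)                          # ScalingFunc with sc_func(X) = X
c = 0.0
K_linear = (x[:, None] - c) * (x[None, :] - c)      # Linear kernel with offset c = 0

assert np.allclose(K_scaling, K_linear)
```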


That was the conclusion I reached as well. Now I just have to test that it is working…

Thanks for all of your help.


Of course! Glad to help