GP kernel for polar angle dimension

I’m using the pm.gp.Latent.prior for a 2D function on polar-like coordinates (rho, theta).
The X support (shape (n_points, 2)) of coordinates features angles -pi < theta < pi as the second dimension. Therefore, I need a kernel in the theta dimension which finds the closest angle difference between two angles.

First I tried modifying the ExpQuad kernel to always take the smallest angle like so

import pymc3 as pm
import numpy as np
import theano.tensor as tt

_PI = np.pi
_2PI = 2*_PI

class StationaryAngleWrap(pm.gp.cov.Stationary):

    def square_dist(self, X, Xs):
        dist = tt.sqrt(super(StationaryAngleWrap, self).square_dist(X, Xs) + 1e-12)  # calculate differences without squares
        return tt.square(tt.switch(dist > _PI, dist - _2PI, dist))  # get smallest possible angle

class ExpQuadAngleWrap(StationaryAngleWrap, pm.gp.cov.ExpQuad):
    pass

However, this usually results in the model failing to initialize with bad initial energy. My guess is that switch complicated the gradient too much.

Therefore, I tried creating a compound kernel which tries different angle difference offsets and should give high covariance for the smallest difference.

class StationaryOffset(pm.gp.cov.Stationary):

    def __init__(self, input_dim, ls=None, ls_inv=None, active_dims=None, offset=0):
        super(StationaryOffset, self).__init__(input_dim, ls, ls_inv, active_dims)
        self._offset = offset

    def square_dist(self, X, Xs):
        return super(StationaryOffset, self).square_dist(X - self._offset, Xs)


class ExpQuadOffset(StationaryOffset, pm.gp.cov.ExpQuad):
    pass

cov_theta = (ExpQuadOffset(2, theta_o_ls, active_dims=[1]) + 
                 ExpQuadOffset(2, theta_o_ls, active_dims=[1], offset=2*np.pi) +
                 ExpQuadOffset(2, theta_o_ls, active_dims=[1], offset=-2*np.pi))

This approach mostly works, but sometimes give strange results at the -pi ~ pi discontinuiti in the coordinates.

Any suggestions how to improve either of these solutions or a different solution is welcome.

1 Like

I think I saw a solution using a periodic distribution somewhere, maybe @aseyboldt or @bwengals knows?

@junpenglao I don’t know if that is what you mean, but we talked about angles and polar coordinates here.

Yes that’s the one I have in mind. Thanks @aseyboldt!

That post deals with a transform between polar and Cartesian coordinates. However, in my problem, I need to use the polar coordinates directly, because the values are expected to be correlated mostly along a constant rho “surface”. So transforming them to cartesian coordinates won’t help me (in fact, I know the cartesian coordinates already).

The transformation is meant for cases where you work with the polar coordinates. The question is, which coordinates the sampler is supposed to see. If the sampler works with cartesian coordinates, then you can solve the problem of the wrapping around of angles, but you pay for this by introducing an additional variable per angle and some trouble around 0. If all your first coordinates are clearly quite a bit larger than 0, you might be able to those as the additional variable, but I think it is safer to just keep this separate (see the comment after the one I linked to).

I found some description of kernels for polar coordinates in this article Polar Gaussian Processes and Experimental Designs in Circular Domains.
I’m thinking of implementing this kernel.

My idea is that I’ll subclass Stationary, use its euclidean_dist and plug that into the formulas in the article.

Hmm, turns out I cannot simply use Stationary directly, because it uses the lengthscale in square_dist already.

I need to do something like

class PolarAngleChordal(pm.gp.cov.Stationary):
    """Kernel for polar angle coordinate using Wendland function of chordal distance

    Based on: Esperan Padonou, O Roustant. Polar Gaussian Processes for Predicting on Circular Domains.
2015.
    """

    def full(self, X, Xs=None):
        X, Xs = self._slice(X, Xs)
        # chordal distance, HOTFIX remove lengthscale here
        d1 = tt.sin(self.euclidean_dist(X, Xs)/2*self.ls)
        # Wendland function with c=2
        return (1 + self.ls * d1) * (1 - d1)**self.ls

Any suggestions how to remove the lengthscale from the distance calculation in a better way?

The chordal distance didn’t perform very well in comparison to the previous 3 compound kernels.

The Geodesic distance-based kernel works much better.

class PolarAngleGeodesic(pm.gp.cov.Stationary):
    """Kernel for polar angle coordinate using Wendland function of geodesic distance

    Based on: E Padonou, O Roustant. Polar Gaussian Processes for Predicting on Circular Domains.
2015.
    """

    def full(self, X, Xs=None):
        X, Xs = self._slice(X, Xs)
        # geodesic distance, HOTFIX remove lengthscale here
        d2 = tt.arccos(tt.cos(self.euclidean_dist(X, Xs)*self.ls))
        # Wendland function with c=pi
        return (1 + self.ls * d2/np.pi) * (1 - d2/np.pi)**self.ls

The call to arccos(cos(...)) seems wasteful though. Essentially it does something similar like the switch in the initial case, but in a “smoother” way I guess.

I’d still like to hear if anyone has suggestins how to improve the switch case.