Graph-based covariance for GP

mochar · August 3, 2020, 3:24pm

Hi everyone. I have a spatial dataset with a set of measurements for each point in a grid layed over a map, something like the following:

Per grid point, I have many samples, each multi-dimensional.

I would like to find a function that maps each point in the grid to its measurement vector using gaussian processes. For the covariates I could simply use the (x, y) coordinates, however this would not take into account the shape of the map. (The region outside of the map cannot be “crossed”) Instead my idea was to calculate a kNN-graph on the coordinates of these points, and somehow use the shortest path between two nodes as a way to quantify covariance between two points.

The idea then would be to assign an index to each point, which I can model as a categorical distribution. Then I would use a pre-computed matrix of shortest paths between every pair of points in a custom covariance function. I have got the following at the moment:

X = # n x 1 index vector for each sample
P = # k x k matrix of shortest paths between any two points in grid
Y = # n x p matrix of measurements 

class ShortestPath(pm.gp.cov.Stationary):
    def __init__(self, P, ls=None, ls_inv=None, active_dims=None):
        super().__init__(1, ls, ls_inv, active_dims)
        self.P = tt.as_tensor_variable(P)

    def full(self, X, Xs=None):
        X, Xs = self._slice(X, Xs)
        index = tt.cast(X[:, 0], 'int32')
        if Xs is None:
            index2 = index.T
        else:
            index2 = tt.cast(Xs[:, 0], 'int32').T
        return self.P[index, index2]

 with pm.Model() as model:
    p = pm.Dirichlet('p', 0.1 * np.ones((n, k)), shape=(n, k))
    x = pm.Categorical('x', p, shape=n, observed=X)
    xs = tt.reshape(x, (n, 1))
    
    for i in range(p):
        cov_func = ShortestPath(P, ls=1)
        gp = pm.gp.Marginal(cov_func=cov_func)
        y = gp.marginal_likelihood(f'y{i}', X=xs, y=Y[:, i], noise=0.01)

Do I have the right idea here?

ckrapu · August 3, 2020, 3:36pm

If you have a graph, you can use a conditional autoregression to allow for covariance between connected nodes. The CAR will not allow correlation to cross the missing regions of the spatial area (e.g. the concavities in the boundary of the shape you’ve shown) unlike a standard GP and will force correlations to travel along the graph’s edges. Even though it is specified between pairs of sites, it allows for long range correlations between nodes.

However, I’m a little unsure if this is precisely what you were asking for. You are assigning an index to each point which is latent and therefore must be modeled. Are you directly interested in the value of that index or do you just regard it as a device to inform a graph covariance structure?

mochar · August 3, 2020, 3:43pm

Thanks for the reply. I am actually interested in the value of the index (or rather the probabilities over the indices). Actually later on I would like to include data points with missing locations to infer their spatial location using their measurements. So a kind of semi supervised approach.

Topic		Replies	Views
Multivariate Gaussian Processes Questions	2	564	July 27, 2020
Gaussian Processes for spatio-temporal raster data Questions	2	483	May 8, 2023
Using a GP as Covariance Matrix for an MvNormal Questions	4	755	June 15, 2020
Simple GP model giving large MAP values for covariance timescale Questions	9	581	August 6, 2021
Sparse GP with linear covariate? Questions	2	600	August 3, 2021

Graph-based covariance for GP

Related topics