Hi everyone. I have a spatial dataset with a set of measurements for each point in a grid layed over a map, something like the following:
Per grid point, I have many samples, each multi-dimensional.
I would like to find a function that maps each point in the grid to its measurement vector using gaussian processes. For the covariates I could simply use the (x, y) coordinates, however this would not take into account the shape of the map. (The region outside of the map cannot be “crossed”) Instead my idea was to calculate a kNN-graph on the coordinates of these points, and somehow use the shortest path between two nodes as a way to quantify covariance between two points.
The idea then would be to assign an index to each point, which I can model as a categorical distribution. Then I would use a pre-computed matrix of shortest paths between every pair of points in a custom covariance function. I have got the following at the moment:
X = # n x 1 index vector for each sample
P = # k x k matrix of shortest paths between any two points in grid
Y = # n x p matrix of measurements
class ShortestPath(pm.gp.cov.Stationary):
def __init__(self, P, ls=None, ls_inv=None, active_dims=None):
super().__init__(1, ls, ls_inv, active_dims)
self.P = tt.as_tensor_variable(P)
def full(self, X, Xs=None):
X, Xs = self._slice(X, Xs)
index = tt.cast(X[:, 0], 'int32')
if Xs is None:
index2 = index.T
else:
index2 = tt.cast(Xs[:, 0], 'int32').T
return self.P[index, index2]
with pm.Model() as model:
p = pm.Dirichlet('p', 0.1 * np.ones((n, k)), shape=(n, k))
x = pm.Categorical('x', p, shape=n, observed=X)
xs = tt.reshape(x, (n, 1))
for i in range(p):
cov_func = ShortestPath(P, ls=1)
gp = pm.gp.Marginal(cov_func=cov_func)
y = gp.marginal_likelihood(f'y{i}', X=xs, y=Y[:, i], noise=0.01)
Do I have the right idea here?