Yeah, definitely look into the Sparse GP implementations, they should help a lot with speed.
According to your suggestion, it is equivalent to I established a gp model for each measuring point, a total of 400 gp models.
Only sort of. My suggestion is to go one of two directions, depending on whether you care about each sensor individually (that is, you are actually interested in comparing how sensor 1 behaves at wind speed
[13.83, 0.43] vs
[10.8, 0.23], and separately how sensor 2 behaves at various wind speeds) or if you only care about them collectively, i.e. you just want to know the mean and standard deviation of the sensor readings. The second case is easier, and seems most similar to your model. This is really just one GP, not 400, but with repeated observations:
# Tile and reshape your input matrix, (88,2) -> (35200,2)
X_tiled = np.tile(X,[1,1,2]).reshape([-1,2])
# Transform and flatten observation matrix, (88,400) -> (35200,)
# Note that if your observations are strictly positive, you should probably
# transform them from (0, +inf) to (-inf, +inf) via a log link. Of course,
# if an observation of 0 is possible, a different transformation function
# or a different likelihood model (via a `Latent` GP) would be better.
lg_y_flat = np.log(y.flatten())
with pm.Model() as gp_model:
ls = pm.Normal('ls',mu = 0.2, sd = 1)
# Is there a reason you didn't include scaling?
η= pm.Gamma('η', alpha=2, beta=1)
cov = η**2 * pm.gp.cov.ExpQuad(2, ls = ls)
gp = pm.gp.Marginal(cov_func=cov)
eps = pm.Exponential('eps', 1) # Noise should be strictly positive
y_pred = gp.marginal_likelihood('y_pred', X = X_tiled, y = lg_y_flat, noise = eps)
# I'd strongly recommend starting with the MAP, at least for prototyping the model
mp = pm.find_MAP()
Alternatively, if you do care about individual sensor behavior, I’d encode the x/y/z position of each sensor in the input data. That means you now have 5 input dimensions (two related to wind speed and three related to position) which will allow the model to capture the spatial correlations between sensor readings. You may want to different covariance kernels for the spatial vs wind speed inputs.
Additional modeling tips
A couple extra thoughts come to mind.
You say X is related to wind speed, might I hazard a guess that these are vectors of speed and angle? If that’s the case, you may want to consider a Circular Kernel that will likely capture angular continuity better.
Particularly at high dimensions, it will be very important to have reasonable priors on the hyperparameters. I’d check out Michael Betancourt’s blog post on the topic and my answer here for how I implement his suggestions in python.