Possible ways to speed up spatio-temporal GP modeling

Totally orthogonal to the discussion of GPs, but if the number of raster cells in your data is smaller than the number of timesteps, you could also consider writing the model as a statespace system. I’m thinking basically a diagonal VAR(p) with spatially correlated errors. This handles missing data very naturally, and might be a more convenient way to represent things. The wrinkle is that the models we have in PyMC at the moment are just slow for very large state vectors (and yours probably would be – in your case n \times p, where n is the number of rasters and p is the number of lags in the VAR).