This seems like a good first step to me. I don’t have a lot of experience with GPs yet, but here are some comments:
- I don’t think you should use `observed`: `y_obs` should probably be passed to `y`, since this kwarg takes the outcomes you observed and are evaluating the GP on.
- In any case, I’m pretty sure you can’t use both `is_observed` and `observed` at the same time, since the former kwarg is used to set `y` as an (un)observed variable in the model.
- I’m not surprised by the mass matrix error: I’m pretty sure your prior on `theta` is not informative enough, which is giving the sampler a hard time because the geometry it’s seeing is pretty flat. I’m guessing this problem is even more pronounced when you increase the data set size – GP inference is O(n³) – hence the long sampling time. I’d advise doing prior predictive checks, so you can see the consequences of your choice of `theta` priors on the GP functions.
Gamma distributions are usually a good choice of prior for lengthscales: they are very flexible, so you can adjust how much information they carry. For more details, you can read the GP chapter in Osvaldo Martin’s book, as well as the Mauna Loa example in the PyMC docs. Finally, for more on the pathological behavior of GPs, you can read this series of articles.
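To make the prior predictive check concrete, here’s a rough NumPy-only sketch (so it runs even without PyMC). Everything here is illustrative, not taken from your model: the grid `X`, the names `ell` and `eta`, and the Gamma(2, 1) hyperparameters are all assumptions you’d tune to your data. The idea is just to draw lengthscales from the prior and look at the GP functions they imply:

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.linspace(0, 10, 50)  # illustrative input grid

def rbf_cov(x, ell, eta=1.0, jitter=1e-6):
    """Squared-exponential covariance matrix with lengthscale ell.

    A small jitter is added to the diagonal for numerical stability.
    """
    d = x[:, None] - x[None, :]
    K = eta**2 * np.exp(-0.5 * (d / ell) ** 2)
    return K + jitter * np.eye(len(x))

# Draw lengthscales from an assumed Gamma(alpha=2, beta=1) prior -- a
# common choice that keeps mass away from both 0 and infinity.
ells = rng.gamma(shape=2.0, scale=1.0, size=5)

# One GP prior function draw per sampled lengthscale. Plotting these
# curves shows how wiggly (or how flat) your prior lets the GP be.
draws = [
    rng.multivariate_normal(np.zeros_like(X), rbf_cov(X, ell))
    for ell in ells
]
```

If the curves you get look implausible for your data (way too wiggly, or essentially flat), that’s a sign to tighten the Gamma parameters before blaming the sampler. In PyMC you’d do the same thing end-to-end with `pm.sample_prior_predictive`.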
Hope this helps 