Interpretation of posterior predictive checks for a Gaussian Process

Thanks for clarifying your comment, Jesse!

I’ve cleaned up my code, which I started writing to learn PyMC, and am attaching it here with the input data. I tested it on my Mac and on Colab, so it should hopefully run without issues.

A few notes:

  • I use the Latent model as a default (i.e., MODEL = 'GP_Latent_TS') based on the comments I received here.
  • Because the model is slow, I used only a few hundred samples for testing purposes – this needs to be increased for real runs.
  • I have two options for the prior on my state vector “Lambda”: truncated normal and lognormal. I find that the lognormal prior yields unrealistic posterior values, likely because of its long tail. But I don’t know how to set bounds for a lognormal. If you have any ideas, please let me know.
  • I added some plots; they show what I’m after as the final result, namely the median Lambda values and their uncertainty.
  • Given the inherent error in my physical model when producing the K matrix, the GP may not capture all of the observations.
  • If you see any mistakes or have any comments or suggestions, your help would be appreciated.
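On the lognormal bounds, one idea I considered (I haven’t verified it in my model, so this is just a sketch with placeholder parameters and hypothetical bounds, not my actual priors) is truncating the lognormal to a physically plausible interval – I believe PyMC offers a `pm.Truncated` wrapper for this. The underlying idea, restricting draws to [lower, upper] via the inverse CDF, looks like this in scipy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
mu, sigma = 0.0, 1.0      # lognormal parameters (placeholders, not my model's)
lower, upper = 0.1, 5.0   # hypothetical physical bounds on Lambda

# Base lognormal: scipy parameterizes it with s=sigma, scale=exp(mu)
dist = stats.lognorm(s=sigma, scale=np.exp(mu))

# Inverse-CDF truncation: draw uniforms on [F(lower), F(upper)],
# then map back through the quantile function.
u = rng.uniform(dist.cdf(lower), dist.cdf(upper), size=10_000)
samples = dist.ppf(u)
```

All draws then fall inside [lower, upper], so the long right tail that seems to cause my unrealistic posterior values is cut off.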

I’m still not sure about posterior predictive sampling (i.e., sample_posterior_predictive) for my positive observations Y, although I may not need it. If I don’t use it, what would be used for model checking?
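My current understanding (which may be wrong) is that sample_posterior_predictive draws replicated datasets, and model checking compares a statistic of the observed Y against the same statistic over those draws (a posterior predictive p-value). A minimal numpy sketch of that comparison, with made-up arrays standing in for my data and draws:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins only: y_obs would be my observed vector, y_rep the
# posterior predictive draws (n_draws x n_obs) returned by
# pm.sample_posterior_predictive.
y_obs = rng.lognormal(0.0, 0.5, size=50)
y_rep = rng.lognormal(0.0, 0.5, size=(1000, 50))

stat = np.mean                 # test statistic; could also be std, max, ...
t_obs = stat(y_obs)
t_rep = stat(y_rep, axis=1)

# Posterior predictive p-value: values near 0 or 1 suggest the model
# fails to reproduce this aspect of the data.
p_val = np.mean(t_rep >= t_obs)
```

If that’s the right picture, I could also just overlay the replicated draws on the observed Y (I think ArviZ’s plot_ppc does something like this) instead of computing a single statistic.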

I couldn’t attach my Jupyter Notebook file (not allowed?), so I converted it to a Python script. In Visual Studio Code, I could run the whole script after pasting it into a single Jupyter Notebook cell. I guess you know this better than I do …

Again, thank you so much for your help!

– Seongeun

GP.py (10.4 KB)

Dmat.csv (589.5 KB)
Y.csv (3.1 KB)
K.csv (908.5 KB)