Hi everyone,
I am going to add support for multi-output Gaussian processes (MOGPs) to the PyMC v4 GP module, focusing mostly on coregionalization regression and the Hadamard kernel.
I would like to ask everyone in our community for your opinions if you are working, or have worked, on multi-output GPs, in either academic or business contexts:
- What applications and/or problems do you apply MOGPs to?
- What libraries/packages do you use for MOGPs, and on what kinds of datasets?
- What challenges do you face when implementing/applying MOGPs in your work?
Also, if you have any suggestions for implementing MOGPs in PyMC v4, please let me know.
Many thanks!
I have recently been in meetings where colleagues are interested in jointly modelling the spatial variability of more than one cancer risk factor across a geographical map. The dataset has approximately 2000 spatial observations. CARBayes (an R package) supports "multivariate CAR" models. CAR models are sort of GPs, but the kernel is only non-zero if the two observations are spatial neighbours. This allows one to attempt either integrated nested Laplace approximations (INLA) or MCMC methods. In practice, I have found that using any MCMC-style algorithm other than HMC means waiting forever to get any reasonable form of convergence.
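To make the neighbour structure concrete, here is a minimal NumPy sketch of a "proper" CAR model on four regions arranged in a line. It assumes the common parameterisation in which the precision matrix is `tau * (D - alpha * A)`; in that form it is the precision matrix, rather than the covariance, that is zero for non-neighbours. The adjacency matrix and the values of `alpha` and `tau` are made up for illustration and are not taken from CARBayes.

```python
import numpy as np

# Hypothetical adjacency for 4 regions in a chain: 0-1, 1-2, 2-3
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Diagonal matrix holding the number of neighbours of each region
D = np.diag(A.sum(axis=1))

alpha = 0.9  # spatial dependence, assumed to lie in (0, 1)
tau = 1.0    # overall precision scale (illustrative value)

# Proper CAR precision matrix: sparse, non-zero only on the
# diagonal and between neighbouring regions
Q = tau * (D - alpha * A)

# The implied covariance is dense: non-neighbours are still correlated
cov = np.linalg.inv(Q)

print(Q)
print(cov.round(3))
```

The sparsity of `Q` is what keeps INLA and MCMC tractable for a few thousand regions.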
I have also talked with people using gstat (another R package) for co-kriging, which is essentially the same as multi-output GPs (I think?). They were looking at modelling multiple mineral measurements at the same mining sites. I am not too sure how they found it.
Hi @conorhassan
Thank you for sharing. It is interesting to know that MOGPs are applied in cancer research and also in the mining industry. I will check these R packages :slight_smile:
Yes, cokriging, which comes from geostatistics, seems similar to a linear coregionalization model. This lecture on multi-output GPs mentions cokriging.
It seems to me that many MOGP datasets have spatial/location characteristics.
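For anyone new to the coregionalization idea, here is a minimal NumPy sketch of the simplest case, the intrinsic coregionalization model, where the covariance between output `i` at `x` and output `j` at `x'` is `B[i, j] * k(x, x')`. It assumes `B = W W^T + diag(kappa)` and an RBF kernel for `k`. The function names (`rbf`, `coregion_matrix`, `icm_kernel`) and all numeric values are hypothetical and chosen only for illustration; this is plain NumPy, not the PyMC API.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def coregion_matrix(W, kappa):
    """Between-output covariance B = W W^T + diag(kappa)."""
    return W @ W.T + np.diag(kappa)

def icm_kernel(x1, idx1, x2, idx2, W, kappa, lengthscale=1.0):
    """Hadamard (element-wise) product of the output and input kernels.

    idx1 and idx2 hold the integer output index of every row.
    """
    B = coregion_matrix(W, kappa)
    return B[np.ix_(idx1, idx2)] * rbf(x1, x2, lengthscale)

rng = np.random.default_rng(0)
n_outputs = 3
W = rng.normal(size=(n_outputs, 2))   # low-rank mixing weights
kappa = np.full(n_outputs, 0.1)       # per-output extra variance

x = np.linspace(0.0, 1.0, 5)
X = np.tile(x, n_outputs)                         # stacked inputs
idx = np.repeat(np.arange(n_outputs), len(x))     # output label per row

K = icm_kernel(X, idx, X, idx, W, kappa, lengthscale=0.3)
print(K.shape)
```

Because every output here is observed at the same inputs, `K` coincides with the Kronecker product of `B` and the input kernel. The index-based form also covers the case where outputs are observed at different locations.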
Maybe a little late, but we’re modelling spectral data. We have many hundreds to several thousand spectral datasets of a few thousand points each, with a corresponding composition for each spectrum consisting of about a dozen different mineral species. Currently we’re using several single-output models, but we would like to use a multi-output model, as the input composition data are correlated.