How to build a Multi-Output Gaussian Process for a Surrogate Model

I am currently using PyMC to do Bayesian inference on a set of fitting parameters for a simulation that has a vector response. I use Scikit-Learn to build the Gaussian process (GP) surrogate model, and a black-box likelihood in PyMC to determine the parameter uncertainty against a piece of experimental data.

In the Scikit-Learn GP there is no explicit multi-output kernel, so each output dimension is a separately trained GP. However, I would obviously prefer a more robust multi-output kernel in order to get better predictions.
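For context, the independent-GP approach described above can be sketched with plain NumPy (the shapes match my problem: 6 inputs, 100 outputs; the kernel, data, and hyperparameters here are placeholder assumptions, not my actual setup). Each output column gets its own GP posterior, but they all ignore correlations between outputs:

```python
import numpy as np

def rbf(X1, X2, ell=1.0, sigma=1.0):
    # Squared-exponential kernel between two sets of inputs.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sigma**2 * np.exp(-0.5 * d2 / ell**2)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))      # 20 training runs, 6 input parameters
Y = rng.normal(size=(20, 100))    # 100-dimensional simulator response
Xs = rng.normal(size=(5, 6))      # new inputs to predict at

# Independent-GP approach: one GP per output column, nothing shared
# except the input kernel matrix, which we can factor once.
K = rbf(X, X) + 1e-6 * np.eye(len(X))   # jitter for numerical stability
L = np.linalg.cholesky(K)
Ks = rbf(Xs, X)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
mean = Ks @ alpha                  # posterior means, shape (5, 100)
```

A multi-output kernel would replace the column-wise independence here with an explicit covariance between outputs.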

I would like to convert my existing method to be wholly within PyMC. However, the examples for multi-output GPs are a bit confusing, and I cannot tell if they suit my problem, which, treated as a black box, is essentially a surrogate model that takes 6 input parameters and predicts a 100-dimensional output vector.

If anyone has done anything similar, or can explain how to adapt any existing methods, that would be great!

With that kind of output dimensionality, I'm not sure you'll get anything in a reasonable amount of time. GPs are powerful but computationally expensive. The number of matrix inversions you need to perform to produce the posterior scales very quickly with the number of output dimensions, and the posterior uncertainty around the n-th output vector will similarly scale very quickly.
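To put a rough number on that scaling: in a coregionalized (ICM-style) model, the joint covariance over n training points and d outputs is an (n·d) × (n·d) matrix, so a single Cholesky factorization costs O((n·d)³). A toy sketch (the identity matrices are stand-ins for real kernel and coregionalization matrices):

```python
import numpy as np

n, d = 20, 30                 # training points, output dimensions
K = np.eye(n)                 # stand-in for the input-kernel Gram matrix
B = np.eye(d)                 # stand-in for the coregionalization matrix
K_joint = np.kron(B, K)       # ICM joint covariance: (n*d) x (n*d)
print(K_joint.shape)          # (600, 600); factorizing it is O((n*d)**3)
```

With 100 outputs even a modest training set produces a joint matrix in the thousands on each side, which is where the cost bites.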

Here’s a great intro lecture: https://www.youtube.com/watch?v=ttgUJtVJthA

Example notebook where you can find more links: Multi-output Gaussian Processes: Coregionalization models using Hadamard product — PyMC example gallery
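The idea behind that notebook, sketched in NumPy (a simplification, not the notebook's actual PyMC code; `W`, `kappa`, and the sizes are illustrative assumptions): stack the inputs for all outputs, tag each row with an output index, and take the elementwise (Hadamard) product of a kernel on the inputs with a coregionalization matrix indexed by those tags:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, p = 10, 4, 6                         # points per output, outputs, input dim
X = np.tile(rng.normal(size=(n, p)), (d, 1))   # inputs stacked once per output
idx = np.repeat(np.arange(d), n)               # output-index tag for each row

def rbf(A, C, ell=1.0):
    # Squared-exponential kernel on the input columns only.
    d2 = ((A[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

W = rng.normal(size=(d, 2))                # low-rank factor
kappa = np.full(d, 0.1)
B = W @ W.T + np.diag(kappa)               # coregionalization matrix (PSD)

K_input = rbf(X, X)                        # covariance from the inputs
K_coreg = B[idx][:, idx]                   # covariance between output indices
K = K_input * K_coreg                      # Hadamard product: joint covariance
```

Since both factors are positive semi-definite, the Hadamard product is too (Schur product theorem), so `K` is a valid joint covariance over all outputs at once. In PyMC this structure corresponds to multiplying an input kernel with `pm.gp.cov.Coregion`.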