Gaussian Process regression with categorical features

I have a case where I’m trying to train a Gaussian Process regression model on both continuous and categorical inputs. I read that the traditional covariance functions used with Gaussian Process regression may not work well with categorical inputs.

I tried to look through the covariance functions available in PyMC3, but I cannot tell whether any of them works well with categorical inputs. I would appreciate any help identifying a suitable covariance function to use with categorical inputs.

Edit:
Also, is there a way to combine two kernels, yet use a different set of features with each kernel?

As an example:

X1, X2 = X[features1], X[features2]
cov = nu ** 2 * pm.gp.cov.Matern32(X1.shape[1], l1) * pm.gp.cov.ExpQuad(X2.shape[1], l2)
gp = pm.gp.Marginal(cov_func=cov)

Notice that each kernel uses a different set of features. Can we do this? If so, how do we pass the data to each kernel?

Thanks a lot.


Yep, different kernels can use different features. Use the input_dim and active_dims parameters of each kernel. This convention was shamelessly stolen from GPy and GPflow because it’s a good idea. input_dim will be equal to the number of columns of X, and active_dims is used to pick out which columns an individual kernel is applied to. So for your case, you’d write:

cov = nu ** 2 * pm.gp.cov.Matern32(input_dim=X.shape[1], ls=l1, active_dims=features1) * pm.gp.cov.ExpQuad(input_dim=X.shape[1], ls=l2, active_dims=features2)
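As for how the data gets passed: you still hand the full X (and y) to the GP once, e.g. through gp.marginal_likelihood, and each kernel internally picks out its own columns via active_dims, which should be lists of integer column indices. A minimal sketch of the whole flow — the priors, the feature index lists, and the noise term here are placeholders I made up for illustration, not something from your model:

import numpy as np
import pymc3 as pm

X = np.random.randn(100, 4)   # full design matrix, all features stacked column-wise
y = np.random.randn(100)

features1 = [0, 1]            # columns handled by the Matern32 kernel
features2 = [2, 3]            # columns handled by the ExpQuad kernel

with pm.Model() as model:
    nu = pm.HalfNormal("nu", sigma=2)
    l1 = pm.Gamma("l1", alpha=2, beta=1)
    l2 = pm.Gamma("l2", alpha=2, beta=1)

    # each kernel sees the full X but only acts on its active_dims columns
    cov = (nu ** 2
           * pm.gp.cov.Matern32(input_dim=X.shape[1], ls=l1, active_dims=features1)
           * pm.gp.cov.ExpQuad(input_dim=X.shape[1], ls=l2, active_dims=features2))

    gp = pm.gp.Marginal(cov_func=cov)
    sigma = pm.HalfNormal("sigma", sigma=1)

    # X is passed once, in full; the column selection happens inside the kernels
    y_ = gp.marginal_likelihood("y_obs", X=X, y=y, noise=sigma)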

Regarding your first question, there aren’t any built-in kernels specifically for dealing with categorical inputs. One option is one-hot encoding your categories and using a standard ExpQuad on the encoded columns. You can also define a custom kernel pretty easily (see here). Or you could use the Coregionalization kernel, possibly without modification.
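For the custom-kernel route, here’s a rough sketch of what it could look like by subclassing pm.gp.cov.Covariance. The class name, the rho parameterization (covariance 1 for matching categories, rho otherwise), and the assumption that the category is stored as integer codes in a single column are all mine for illustration, not from the docs:

import theano.tensor as tt
import pymc3 as pm

class CategoryKernel(pm.gp.cov.Covariance):
    # Hypothetical kernel over a single column of integer category codes:
    # covariance is 1 when two points share a category and rho otherwise
    # (rho between 0 and 1 keeps the matrix positive semi-definite).
    def __init__(self, input_dim, rho, active_dims=None):
        super().__init__(input_dim, active_dims)
        self.rho = rho

    def full(self, X, Xs=None):
        X, Xs = self._slice(X, Xs)
        if Xs is None:
            Xs = X
        same = tt.eq(X, tt.transpose(Xs))   # pairwise "same category" matrix
        return tt.switch(same, 1.0, self.rho)

    def diag(self, X):
        return tt.alloc(1.0, X.shape[0])

# usage sketch: continuous features in columns 0-1, category codes in column 2
with pm.Model() as model:
    rho = pm.Beta("rho", alpha=2, beta=2)
    ls = pm.Gamma("ls", alpha=2, beta=1)
    eta = pm.HalfNormal("eta", sigma=2)
    cov = (eta ** 2
           * pm.gp.cov.ExpQuad(input_dim=3, ls=ls, active_dims=[0, 1])
           * CategoryKernel(input_dim=3, rho=rho, active_dims=[2]))
    gp = pm.gp.Marginal(cov_func=cov)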
