Yep, different kernels can use different features. Use the input_dim and active_dims parameters of each kernel. This convention was shamelessly stolen from GPy and GPflow because it’s a good idea. input_dim will be equal to the number of columns of X, and active_dims is used to pick out which columns an individual kernel is applied to. So for your case, you’d write:
cov = nu**2 * pm.gp.cov.Matern32(input_dim=X.shape[1], active_dims=features1) * pm.gp.cov.ExpQuad(input_dim=X.shape[1], active_dims=features2)
Regarding your first question, there aren’t any built in kernels specifically for dealing with categorical inputs. One option is one-hot encoding your categories and using a standard ExpQuad. You can also define a custom kernel pretty easily (see here). Or you could use the Coregionalization kernel, possibly without modification.