I have also recently been struggling with this. What I wound up doing, and I don’t know if this is at all the correct/best way to do it, is make a custom covariance function. However I am still struggling on the custom mean. Here is what I have for the covariance:
class IdCov(pm.gp.cov.Covariance):
def __init__(self, input_dim, active_dims=None, sigma=1.0):
super(IdCov, self).__init__(input_dim=input_dim, active_dims=active_dims)
self.sigma = sigma
def diag(self, X):
return tt.alloc(tt.square(self.sigma), X.shape[0])
def _dist(self, X, Xs):
return tt.cast(tt.eq(tt.reshape(X, (-1, 1)), tt.reshape(Xs, (1, -1))), 'int8')
def full(self, X, Xs=None):
X, Xs = self._slice(X, Xs)
if Xs is None:
return tt.alloc(self.sigma * self._dist(X, X), X.shape[0], X.shape[0])
else:
return tt.alloc(self.sigma * self._dist(X, Xs), X.shape[0], Xs.shape[0])
You can take your two-dimensional data, where dimension 1 is a continuous feature and dimension 2 is the ID of your groups, and create a combined covariance kernel:
x1, x2 = np.meshgrid(np.linspace(0,1,10), np.arange(1,4))
X = np.concatenate((x1.reshape((30,1)), x2.reshape((30,1))), axis=1)
cov1 = pm.gp.cov.ExpQuad(2, 0.5, active_dims=[0])
cov2 = IdCov(2, sigma=1.0, active_dims=[1])
cov = cov1 * cov2
K = function([], cov(X))()
m = plt.imshow(K, cmap="inferno", interpolation='none'); plt.colorbar(m);
However, I am struggling to develop a mean function that allows you to learn a separate mean for each group. If someone with more PyMC3 expertise could weigh in, that would be great!