Hierarchical Time Series w/Gaussian Process: Any examples?

I have a panel dataset where each row of data is a customer X timeperiod combination. A continuous performance metric is recorded for each row. I also have a bunch of customer-level attributes that are time-invariant.

I’m attempting to model the evolution of performance metric as a 2-level hierarchical gaussian process. In this model, each customer’s timeseries has a GP fit to their data, but the parameters governing the covariance function of the GP are generated from a customer-level model.

I’m fairly new to pymc3 and am struggling to piece this together for example code in the docs. Was wondering if anyone had attempted a model like this before and can refer papers/blogs/insights.

Thanks

Have you check out the discussion in this post? Multiple (uncertain) function observations of the same Gaussian process

I have also recently been struggling with this. What I wound up doing, and I don’t know if this is at all the correct/best way to do it, is make a custom covariance function. However I am still struggling on the custom mean. Here is what I have for the covariance:

class IdCov(pm.gp.cov.Covariance):
    def __init__(self, input_dim, active_dims=None, sigma=1.0):
      super(IdCov, self).__init__(input_dim=input_dim, active_dims=active_dims)
      self.sigma = sigma

    def diag(self, X):
      return tt.alloc(tt.square(self.sigma), X.shape[0])

    def _dist(self, X, Xs):
      return tt.cast(tt.eq(tt.reshape(X, (-1, 1)), tt.reshape(Xs, (1, -1))), 'int8')

    def full(self, X, Xs=None):
      X, Xs = self._slice(X, Xs)
      if Xs is None:
        return tt.alloc(self.sigma * self._dist(X, X), X.shape[0], X.shape[0])
      else:
        return tt.alloc(self.sigma * self._dist(X, Xs), X.shape[0], Xs.shape[0])

You can take your two-dimensional data, where dimension 1 is a continuous feature and dimension 2 is the ID of your groups, and create a combined covariance kernel:

x1, x2 = np.meshgrid(np.linspace(0,1,10), np.arange(1,4))
X = np.concatenate((x1.reshape((30,1)), x2.reshape((30,1))), axis=1)

cov1 = pm.gp.cov.ExpQuad(2, 0.5, active_dims=[0])
cov2 = IdCov(2, sigma=1.0, active_dims=[1])
cov = cov1 * cov2

K = function([], cov(X))()
m = plt.imshow(K, cmap="inferno", interpolation='none'); plt.colorbar(m);

block_diag_kernel

However, I am struggling to develop a mean function that allows you to learn a separate mean for each group. If someone with more PyMC3 expertise could weigh in, that would be great!

Here’s an example in stan that would be amazing to see in pymc3:
http://mc-stan.org/events/stancon2017-notebooks/stancon2017-trangucci-hierarchical-gps.pdf

it’s a bit beyond my skill level in pymc3, any help would be great. if i do make any progress I’ll share here.

1 Like