gp.Latent() different shape / vectorize without loop

I’m trying to draw a GP sample for each of B groups, each with different hyperparameters. Simplified, my code looks something like this:

# draw hyperparams with size B-1
mean = pm.Normal('mean', mu=0, sigma=1, shape=B-1)
ls = pm.Gamma('ls', alpha=1, beta=0.5, shape=B-1)

# code to generate the GP
cov = pm.gp.cov.ExpQuad(1, ls_inv=ls) + pm.gp.cov.WhiteNoise(1e-6)
gp = pm.gp.Latent(mean_func=pm.gp.mean.Constant(mean), cov_func=cov)

# draw
sample = gp.prior("sample", X=np.arange(1, P+1)[:, None])

Everything up to the draw block seems to be vectorized correctly, but the last line throws a dimensionality error because of the definition of X. For reference, the code works if the shape of each variable is just 1.

I have a working version using a list comprehension, but I’m hoping for a vectorized solution to speed things up. Here is the list-comprehension version:

mean = pm.Normal('mean', mu=0, sigma=1, shape=B-1)
ls = pm.Gamma('ls', alpha=1, beta=0.5, shape=B-1)

cov = [pm.gp.cov.ExpQuad(1, ls_inv=ls[i]) + pm.gp.cov.WhiteNoise(1e-6) for i in range(B-1)]
gps = [pm.gp.Latent(mean_func=pm.gp.mean.Constant(mean[i]), cov_func=cov[i]) for i in range(B-1)]
sample = at.stack([gp.prior("sample" + str(i), X=np.arange(1, P+1)[:, None]) for i, gp in enumerate(gps)], axis=1)

Thanks for taking a look!

One quick thing to note: it’s probably a bit clearer to use the jitter argument of gp.Latent instead of an actual WhiteNoise kernel. The reason is that I’m assuming you’re adding a bit to the diagonal for numerical reasons, not adding an actual white noise process to your model. The effect is the same either way, though. Also, watch the amplitude scaling factor, as in eta**2 * pm.gp.cov.ExpQuad(...); you’ll usually want that.
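To show why that diagonal jitter matters, here’s a plain-numpy sketch (not PyMC code; the `expquad` helper and the jitter value are my own choices for illustration). A very smooth ExpQuad covariance can be numerically rank-deficient, so the Cholesky factorization may fail until you add a tiny value to the diagonal:

```python
import numpy as np

def expquad(X, ls, eta=1.0):
    """Squared-exponential covariance: eta**2 * exp(-0.5 * d**2 / ls**2)."""
    d = X[:, None] - X[None, :]
    return eta**2 * np.exp(-0.5 * (d / ls) ** 2)

X = np.linspace(0, 1, 50)
K = expquad(X, ls=2.0)  # long lengthscale relative to X -> nearly singular

# On near-singular matrices this can raise LinAlgError due to
# floating-point eigenvalues dipping below zero.
try:
    np.linalg.cholesky(K)
except np.linalg.LinAlgError:
    pass

# A tiny diagonal "jitter" restores numerical positive definiteness.
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(X)))
```

The jitter is small enough (1e-6 here) that it has no practical effect on the model, which is why it’s cleaner to treat it as a numerical fix rather than a WhiteNoise component.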

Since you’re using the same X for each GP, you can definitely vectorize this better without at.stack. Unfortunately, this isn’t currently built into the PyMC gp.Latent class (though @danhphan is working on adding multi-output GPs now!). Though it’s written for gp.Marginal, this gist shows how you can code it up. It’ll need some small changes to adapt it for gp.Latent, but reach out if you hit a snag.
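To sketch the underlying idea in plain numpy (this is my own toy version, not the gist’s code): broadcast the per-GP lengthscales into a (B, P, P) stack of covariance matrices, then use the batched Cholesky and a single matrix-vector product to draw all B samples with no Python loop:

```python
import numpy as np

rng = np.random.default_rng(0)
B, P = 4, 30                       # number of GPs, number of input points
X = np.arange(1, P + 1).astype(float)
ls = rng.gamma(1.0, 2.0, size=B)   # one lengthscale per GP
mean = rng.normal(size=B)          # one constant mean per GP

# (B, P, P) stack of ExpQuad covariances, built by broadcasting.
d = X[:, None] - X[None, :]                       # (P, P) pairwise distances
K = np.exp(-0.5 * (d[None] / ls[:, None, None]) ** 2)
K += 1e-6 * np.eye(P)                             # jitter on each diagonal

# Batched Cholesky + standard-normal draws -> B GP samples at once.
L = np.linalg.cholesky(K)                         # (B, P, P)
z = rng.standard_normal((B, P))
samples = mean[:, None] + np.einsum('bij,bj->bi', L, z)   # (B, P)
```

In PyMC you’d do the same reparameterization with tensor ops and a single Normal of shape (B, P), which is essentially what the gist implements for gp.Marginal.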

In case this isn’t the answer you need, would it be possible to provide some fake data and a bit more code so I can run it and see the same issue? Is B = 1 in the first example, and what’s P?


Thanks for the helpful notes. I’ve switched to the jitter argument and added the amplitude term.

I ended up having to change the original problem statement to one where N individuals share the same covariance matrix under a GP across P periods. I was able to use LatentKron to achieve this, similar to what was suggested here. But many thanks for the gist anyway, since I might run into this issue in the near future.
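For anyone landing here later, the structure LatentKron exploits can be sketched in plain numpy (toy covariances of my own choosing, not the actual model): when all N individuals share one P x P temporal covariance and an N x N covariance relates the individuals, the joint covariance over all N*P values is their Kronecker product, so you never build per-individual GPs:

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 3, 5                                   # individuals, periods

def expquad(x, ls):
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

K_t = expquad(np.arange(P, dtype=float), ls=2.0) + 1e-6 * np.eye(P)
K_n = expquad(np.arange(N, dtype=float), ls=1.0) + 1e-6 * np.eye(N)

# Joint covariance over all N*P points is the Kronecker product.
# (LatentKron never materializes this; it works with the factors.)
K = np.kron(K_n, K_t)                          # (N*P, N*P)

# One draw from the joint GP, reshaped to one row per individual.
f = np.linalg.cholesky(K) @ rng.standard_normal(N * P)
f = f.reshape(N, P)
```

The payoff is that factorizing the small K_n and K_t separately is far cheaper than factorizing the full N*P x N*P matrix.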
