Possible ways to speed up spatio-temporal GP modeling

If there is a simple toy example for this full grid + indexing, that would be extremely helpful.

Sure. Slight modification of this example. I removed all the comments except for what’s relevant for this discussion.

import arviz as az
import matplotlib as mpl
import numpy as np
import pymc as pm
plt = mpl.pyplot


RANDOM_SEED = 12345
rng = np.random.default_rng(RANDOM_SEED)

n1, n2 = (50, 30)
x1 = np.linspace(0, 5, n1)
x2 = np.linspace(0, 3, n2)

X = pm.math.cartesian(x1[:, None], x2[:, None])

l1_true = 0.8
l2_true = 1.0
eta_true = 1.0

cov = (
    eta_true**2
    * pm.gp.cov.Matern52(2, l1_true, active_dims=[0])
    * pm.gp.cov.Cosine(2, ls=l2_true, active_dims=[1])
)

K = cov(X).eval()
f_true = rng.multivariate_normal(np.zeros(X.shape[0]), K, 1).flatten()

sigma_true = 0.25
y_true = f_true + sigma_true * rng.standard_normal(X.shape[0])

# indices of the y data we're keeping
keep_percentage = 0.5
n_keep = int(np.round(keep_percentage * len(y_true)))
ix = np.arange(len(y_true))
observed_ix = np.random.choice(ix, size=n_keep, replace=False)
y = y_true[observed_ix] # now your observed data is a selection of y_true, and you know the indices


with pm.Model() as model:
    ls1 = pm.Gamma("ls1", alpha=2, beta=2)
    ls2 = pm.Gamma("ls2", alpha=2, beta=2)
    eta = pm.HalfNormal("eta", sigma=2)

    cov_x1 = pm.gp.cov.Matern52(1, ls=ls1)
    cov_x2 = eta**2 * pm.gp.cov.Cosine(1, ls=ls2)

    sigma = pm.HalfNormal("sigma", sigma=2)

    gp = pm.gp.LatentKron(cov_funcs=[cov_x1, cov_x2])
    f = gp.prior("f", Xs=Xs)

    # This line is the only one that's different.  Index the GP where you observed data
    f_obs = f[observed_ix]
    y_ = pm.Normal("y_", mu=f_obs, sigma=sigma, observed=y)

Now your trace of f will cover the full grid of your data. It will give you the same posterior of f as if you used the GP to “predict” what the f values would be there.

If you have 700 total data points, how big is your spatial grid, and how many points in time?

1 Like