Multi-Output Spatial Gaussian Process Prior Predictive Simulation

I have put together a spatial multi-output Gaussian process and performed prior predictive simulations on it.
I am posting it below for review (I have a few questions) and as a helper for others seeking to perform prior predictive checks on this kind of model (I couldn't find a similar post when I started).

My primary concern is that something may be failing silently or that some functionality isn’t implemented correctly.

My code is based on these excellent posts:

My data comprises multivariate recordings on archaeological artefacts. I am seeking to model the spatial relationship as a multi-output Gaussian process with intrinsic coregionalization. In my final model I will have 8 output parameters and a location for each observation as input.

My current model, with simulated input data is:

import numpy as np
import pandas as pd
import pymc as pm
import matplotlib.pyplot as plt

# Data structure of sample
n1, n2 = (50, 50)
x1 = np.linspace(0, 50, n1)
x2 = np.linspace(0, 50, n2)
param_idx = [0, 1, 2, 3]  # 4 parameters
result_id = np.array(param_idx * n1 * n2)  # [0,1,2,3] tiled once per grid point

Xs = pm.math.cartesian(x1[:, None], x2[:, None])
Xs = np.vstack((np.arange(0, Xs.shape[0]), Xs[:, 0], Xs[:, 1]))
Xs = np.vstack((np.repeat(Xs, len(param_idx), axis=1), result_id))  # sample_id, x-coord, y-coord, result_id
Xs = Xs.T
Xdf = pd.DataFrame(Xs, columns=['sample_id', 'x coord', 'y coord', 'parameter id'])
Xtest = Xdf[['x coord', 'y coord', 'parameter id']].values
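For anyone reproducing the design matrix, here is a PyMC-free numpy rebuild of the same layout with quick shape checks (a sketch; `pm.math.cartesian` is replaced by a meshgrid, which I believe gives the same row order):

```python
import numpy as np

n1, n2 = 50, 50
param_idx = [0, 1, 2, 3]

x1 = np.linspace(0, 50, n1)
x2 = np.linspace(0, 50, n2)

# Equivalent of pm.math.cartesian for two 1-D grids (x1 varies slowest)
grid = np.dstack(np.meshgrid(x1, x2, indexing='ij')).reshape(-1, 2)

# Repeat each spatial point once per output; tile the output index alongside
coords = np.repeat(grid, len(param_idx), axis=0)
out_id = np.tile(param_idx, n1 * n2)
Xtest = np.column_stack([coords, out_id])

assert Xtest.shape == (n1 * n2 * len(param_idx), 3)
# Rows 0-3 share one spatial location and cover output ids 0..3
assert np.all(Xtest[:4, :2] == Xtest[0, :2])
assert list(Xtest[:4, 2]) == [0, 1, 2, 3]
```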

n_outputs = len(param_idx)
with pm.Model() as spatial_model:
    # GP
    ## Feature covariance

    length_scale = pm.InverseGamma("length_scale",alpha=4.5,beta=18)
    amplitude = pm.HalfCauchy("amplitude", beta = 10)
    cov_feature = amplitude ** 2 * pm.gp.cov.Matern52(
        input_dim = 3,
        ls = length_scale,
        active_dims = [0,1] # all except index
    )

    ## Coregion covariance
    W = pm.Normal(
        "W", mu = 0, sigma = 5,
        shape = (n_outputs, n_outputs)  # 4 outputs in this simulated example
    )
    kappa = pm.HalfCauchy("kappa", beta = 10, shape = n_outputs)
    coreg = pm.gp.cov.Coregion(
        input_dim = Xtest.shape[1],
        active_dims = [2], # only index
        kappa = kappa,
        W = W
    )

    ## Combined covariance
    cov_f = coreg * cov_feature

    ## Gaussian process
    #gp = pm.gp.Marginal(cov_func = cov_f) 
    gp = pm.gp.Latent(cov_func = cov_f)

    ## Prior predictive simulation
    gp_prior = gp.prior('gpprior', Xtest)
    f_star = gp.conditional('fstar', Xtest)
    prior_sample = pm.sample_prior_predictive(draws=1)
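As a sanity check on the coregionalization piece, the inter-output covariance implied by a prior draw is B = W Wᵀ + diag(κ), which can be inspected with plain numpy. A small sketch (the draw values below are simulated stand-ins for e.g. `prior_sample.prior["W"].values[0, 0]`):

```python
import numpy as np

# Hypothetical prior-draw values for W (4x4) and kappa (4,), standing in
# for the values you would pull out of prior_sample
rng = np.random.default_rng(0)
W_draw = rng.normal(0, 5, size=(4, 4))
kappa_draw = np.abs(rng.normal(0, 10, size=4))

# The Coregion kernel evaluates to B[i, j] for output indices i, j,
# where B = W @ W.T + diag(kappa)
B = W_draw @ W_draw.T + np.diag(kappa_draw)

# B is symmetric and positive definite by construction (kappa > 0)
assert np.allclose(B, B.T)
assert np.all(np.linalg.eigvalsh(B) > 0)
```

Plotting B as a heatmap across several prior draws is a cheap way to see whether the priors on W and kappa put mass on the inter-output correlations you expect.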

The following is then taken from the prior_sample to visualise the results on a “map”

fig, axs = plt.subplots(2, 2)
prior_pred = prior_sample.prior.fstar.values[0, 0, :]  # chain 0, draw 0
pf = 0  # Parameter spatial function

for i in range(2):
    for j in range(2):
        axs[i, j].scatter(Xtest[:, 0][Xtest[:, 2] == pf],  # X coords for function pf
            Xtest[:, 1][Xtest[:, 2] == pf],  # Y coords for function pf
            s=35,
            c=prior_pred[Xtest[:, 2] == pf], cmap='viridis')
        axs[i, j].set_xlabel('X_coord')
        axs[i, j].set_ylabel('Y_coord')
        axs[i, j].set_title(f"Parameter Function: {pf}")
        axs[i, j].set_aspect('equal')
        pf += 1
fig.suptitle('Prior Predictive Simulations')

Which produces:
[image: prior_preds_1 — 2×2 grid of prior draws, one panel per parameter function]

To my eyes these look sufficiently random and spatially consistent. I would appreciate it if anyone could look through the code to confirm the active dimensions and interactions are functioning correctly: that is, that there is a unique Gaussian process for each parameter/output, whose covariances are modified by a coregionalization kernel for the inter-parameter covariances.

A couple of questions:

  • In my understanding, the W matrix, together with kappa, defines the covariance between parameters/outputs. The W matrix can apparently have a lower rank imposed on it; is there a principled way to do this, or is it generally a computation saver? The baseball MOGP example imposes a (5 outputs, 2) shape on the W matrix.
  • Sampling from this particular prior is quite slow, other than using the HSGP, is there an alternative to speed up sampling?
  • Prior sampling can produce a warning that the covariance is not symmetric positive semi-definite. Is this a concern?
Thanks in advance.

@bwengals @lucianopaz

> In my understanding, the W matrix, with kappa, is the covariance between parameters/outputs. The W matrix can apparently have a lower rank imposed on it, is there a principled way to do this? Or is it generally a computation saver. The baseball MOGP example imposes a (5 outputs, 2) structure for the W matrix.

For some more info, check for example here starting at slide 134.
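To make the low-rank idea concrete, here is a small numpy sketch (names are illustrative): with W of shape (n_outputs, rank), the inter-output covariance B = W Wᵀ + diag(κ) is built from `rank` shared latent processes, which both encodes a modelling assumption (how many independent spatial patterns the outputs share) and cuts the number of free parameters.

```python
import numpy as np

n_outputs, rank = 4, 2
rng = np.random.default_rng(1)
W = rng.normal(size=(n_outputs, rank))  # low-rank factor, as in the baseball example's (5, 2)
kappa = np.full(n_outputs, 1e-3)        # diagonal keeps B strictly positive definite

B = W @ W.T + np.diag(kappa)

# Without the diagonal, W @ W.T has rank at most `rank`
assert np.linalg.matrix_rank(W @ W.T) == rank
# With kappa > 0, B itself is full rank
assert np.linalg.matrix_rank(B) == n_outputs
```

So the principled reading is "I believe my 4 (or 8) outputs are driven by roughly `rank` underlying spatial processes"; the parameter savings (n_outputs × rank instead of n_outputs²) are a side benefit.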

> Sampling from this particular prior is quite slow; other than using the HSGP, is there an alternative to speed up sampling?

Other than HSGP your other main option is to exploit Kronecker structure if you can.
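For the ICM model above, when every output is observed at every spatial location the full covariance is exactly the Kronecker product B ⊗ K_spatial, which is what e.g. `pm.gp.LatentKron`/`pm.gp.MarginalKron` exploit. A self-contained numpy sketch of why this helps (illustrative sizes; tiny jitter added to keep the toy kernel matrix factorizable):

```python
import numpy as np

rng = np.random.default_rng(2)
n_points, n_outputs = 5, 3

# Tiny spatial kernel matrix K (squared exponential, ls = 1) and output covariance B
X = rng.uniform(0, 10, size=(n_points, 2))
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * d2) + 1e-9 * np.eye(n_points)
A = rng.normal(size=(n_outputs, n_outputs))
B = A @ A.T + 1e-6 * np.eye(n_outputs)

# Full ICM covariance: cov[(output i, point p), (output j, point q)] = B[i, j] * K[p, q]
full = np.kron(B, K)

# Kronecker identity: the Cholesky factor of B ⊗ K is chol(B) ⊗ chol(K),
# so factorization costs O(m^3 + n^3) instead of O((m n)^3)
Lb, Lk = np.linalg.cholesky(B), np.linalg.cholesky(K)
L_full = np.kron(Lb, Lk)
assert np.allclose(L_full @ L_full.T, full)
```

The catch is the structure requirement: every output at every location, and (for the Kron implementations) inputs on a full grid. Missing observations break the Kronecker factorization.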

> Prior sampling can produce a warning that the covariance is not symmetric positive semi-definite. Is this a concern?

I wouldn’t sweat it. It’s just a numerical stability issue, because your covariance is positive semi-definite in exact arithmetic. There is a jitter kwarg that you can increase from the default value of 1e-6 to fix this.
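To illustrate what jitter does, a minimal numpy sketch: a matrix that is positive semi-definite in exact arithmetic can still fail a Cholesky factorization on a zero (or slightly negative) pivot, and a small bump on the diagonal fixes it.

```python
import numpy as np

# PSD but rank-deficient: eigenvalues are (3, 0, 0), so Cholesky
# hits a non-positive pivot and raises
K = np.ones((3, 3))
try:
    np.linalg.cholesky(K)
    ok_without_jitter = True
except np.linalg.LinAlgError:
    ok_without_jitter = False

# Adding a small jitter to the diagonal makes K strictly positive definite
jitter = 1e-6
L = np.linalg.cholesky(K + jitter * np.eye(3))

assert not ok_without_jitter
assert np.allclose(L @ L.T, K + jitter * np.eye(3))
```

The GP covariance built from the model above sits in the same regime: nearby grid points make K numerically near-singular, and the jitter term is just this diagonal bump.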
