If I understand the problem correctly (a shaky assumption at times), then:
timeD can be quickly computed with broadcast subtraction: timeD = T[:, None] - T[None].
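For example, a minimal sketch, assuming your dataframe xyt also carries the event times in a t column (more on the T.min() shift below):

import numpy as np

T = xyt['t'].values  # the 't' column is an assumption; adjust to your data
T = T - T.min()      # shift so the first event is at time 0
timeD = T[:, None] - T[None]  # (N, N) matrix with timeD[i, j] = T[i] - T[j]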
For spaceD, you can use scipy.spatial.distance.pdist to compute the pairwise Euclidean distances (plus squareform to expand its condensed output into a full matrix), then transform them with the Gaussian kernel from the appendix:
from scipy.spatial.distance import pdist, squareform

xy_coords = xyt[['x', 'y']].values
# 2-D Gaussian kernel over the pairwise distances; check the exact normalization against the appendix
spaceD = np.exp(-squareform(pdist(xy_coords, 'euclidean')) ** 2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
So that will give you two N × N matrices with all the preliminaries you need. As an aside, they subtract T.min() from T as a normalization, so that the first time in the dataset is 0. It keeps you from having underflows when you exponentiate. Personally I’d even go one step further and normalize it so the dates are all on the 0-1 interval (e.g. with sklearn.preprocessing.MinMaxScaler), but that’s a personal preference.
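A minimal sketch of that rescaling, assuming T is the 1-D array of event times from above:

from sklearn.preprocessing import MinMaxScaler

# rescale event times onto [0, 1]; MinMaxScaler wants a 2-D array
T = MinMaxScaler().fit_transform(T[:, None]).ravel()
timeD = T[:, None] - T[None]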
Next you need to do the ll computation. For this I like to use masks. I’d start by just naively computing everything, as in:
ll = a * lengthscaleT * np.exp(-timeD * lengthscaleT) * spaceD
This ll holds the element-wise computation for every possible (i, j) combination. The trick now is to add up only certain elements. It turns out to be the strict lower triangle of this matrix. You can convince yourself of this by running the following code (which is the loop in the Stan code, remembering that Stan starts indices from 1):
n = 5
mask = np.zeros((n, n))
for i in range(1, n):
    for j in range(0, i):
        mask[i, j] = 1
mask
>> Out: array([[0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0.],
       [1., 1., 0., 0., 0.],
       [1., 1., 1., 0., 0.],
       [1., 1., 1., 1., 0.]])
Anyway there’s an easier way to make this matrix than doing that loop:
addition_mask = np.zeros_like(timeD)
# k=-1 picks the indices strictly below the diagonal, matching the loop above
addition_mask[np.tril_indices_from(addition_mask, k=-1)] = 1
So we multiply ll by this mask, then sum away the second dimension: ll = (ll * addition_mask).sum(axis=-1), and we’re done.
Note that you could also use this masking method to do the summation even if the rows are not sorted by time; you’d just have to be a bit more clever about how to construct the mask, as in the sketch below.
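For instance, assuming all timestamps are distinct, you can mask in every pair where event j happened strictly before event i; when the rows are sorted by time this reduces to the strict lower triangle above:

addition_mask = (T[None] < T[:, None]).astype(float)  # 1 where T[j] < T[i]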
For actually making the PyMC model, I’d make the addition mask and the distance matrix in numpy, then call pt.as_tensor_variable on the results; otherwise things should map over 1:1. You could even use a Gaussian covariance kernel from pm.gp.cov instead of pdist if you wanted to be fancy.
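Something like this, as a minimal sketch of the wiring; the priors are placeholders I made up, and the pm.Potential term just stands in for however the Stan model folds ll into the target:

import pymc as pm
import pytensor.tensor as pt

# wrap the precomputed numpy arrays so PyMC treats them as fixed data
timeD_pt = pt.as_tensor_variable(timeD)
spaceD_pt = pt.as_tensor_variable(spaceD)
mask_pt = pt.as_tensor_variable(addition_mask)

with pm.Model() as model:
    # placeholder priors; swap in whatever the Stan model actually uses
    a = pm.HalfNormal('a', sigma=1.0)
    lengthscaleT = pm.HalfNormal('lengthscaleT', sigma=1.0)

    # same element-wise computation and masked sum as the numpy version above
    ll = a * lengthscaleT * pt.exp(-timeD_pt * lengthscaleT) * spaceD_pt
    ll_per_event = (ll * mask_pt).sum(axis=-1)

    # stand-in only: adapt to however the full log-likelihood is assembled
    pm.Potential('excitation', ll_per_event.sum())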