Avoiding looping when using GP prior on latent variables

Hello PyMC devs and users,

I’m just getting more acquainted with PyMC3, so apologies if this is a silly issue.

I am working on a model whose essence can be described as follows:

y[i, w] ~ Poisson(lam[i, w])    (likelihood)
lam[i, w] = exp(gamma[i, w])
gamma[i, :] ~ GP(0, K(w, w'; params)) for each i, where K has size [W by W]

i can be interpreted as individuals, i \in {1, 2, ..., I}
w can be interpreted as weeks, w \in {1, 2, ..., W}

In other words, I have an inhomogeneous Poisson process, where the (log) arrival rate is a Gaussian process. For each individual i, I want the same Gaussian process to describe the joint density of gamma[i, :].
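
To make the data-generating process concrete, here is a small NumPy-only forward simulation (the sizes, length-scale, and variance below are just placeholders):

import numpy as np

rng = np.random.default_rng(42)
I_n, W_n = 10, 52                       # number of individuals and weeks (example sizes)
weeks = np.arange(W_n)[:, None]         # week index as a column vector

# Squared-exponential kernel over weeks, shared by every individual
ls, sigma2 = 3.0, 1.0
K = sigma2 * np.exp(-0.5 * (weeks - weeks.T) ** 2 / ls**2) + 1e-6 * np.eye(W_n)

# One independent GP draw per individual, all from the same prior
gamma = rng.multivariate_normal(np.zeros(W_n), K, size=I_n)    # shape (I_n, W_n)
y = rng.poisson(np.exp(gamma))                                 # shape (I_n, W_n)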

In the model context, I have written this using a for loop as follows:

cov_gamma = gamma_sigma * pm.gp.cov.ExpQuad(input_dim=1, ls=1)
gamma_gp = pm.gp.Latent(cov_func=cov_gamma)

#==== Loop over each individual to extract a draw (into an I x W matrix)

gamma_iw = tt.zeros(shape=(len(I), len(W)))

for i in range(len(I)):
    gamma = gamma_gp.prior(f"gamma_{i}", X=W)              # W is the vector of weeks
    gamma_iw = tt.set_subtensor(gamma_iw[i, :], gamma)     # set_subtensor returns a new tensor, so reassign

I believe this is correct. But compilation of this model is extremely slow as I increases. Seeing as the GP is essentially a pm.MvNormal(), where we can adjust the shape, is there a more straightforward way to specify this than the for loop?

I hope my question was clear, but I’m happy to clarify further if needed!

Try this:

gamma = gamma_gp.prior(f"gamma_{i}", X=W)
gamma_iw = tt.repeat(gamma[:, None], I, axis=0)

This just adds another dimension and then repeats over that dimension to give you I identical copies of gamma in an array with shape (I, W).


Hm, just tried it. I'm getting the following error related to the tt.repeat call:

File ~/.conda/envs/routines/lib/python3.9/site-packages/theano/tensor/extra_ops.py:786, in repeat(x, repeats, axis)
    783 repeats = basic.as_tensor_variable(repeats)
    785 if repeats.ndim > 1:
--> 786     raise ValueError("The dimension of repeats should not exceed 1.")
    788 if repeats.ndim == 1 and not repeats.broadcastable[0]:
    789     return RepeatOp(axis=axis)(x, repeats)

ValueError: The dimension of repeats should not exceed 1.

@ckrapu the basic idea of your code is correct. I was a vector in my code, but I'm sure you meant len(I) as the number of repeats, so I should have caught that. I think the following works:

gamma = gamma_gp.prior(f"gamma_{i}", X=W)
gamma_iw = tt.repeat(gamma[:, None], len(I), axis=1)

But one small point: gamma_iw is a (W, I) matrix because I set axis=1. With axis=0 I get an (I*W, 1) matrix, so axis=0 does not quite do what you expected from the repeat operation. I might just have to transpose, which isn't an issue.

Ah, my mistake then. You could also do

gamma_iw = tt.repeat(gamma[None, :], len(I), axis=0)

and that should give you the (I, W) matrix. The trick here is to create the extra dimension of length 1 by indexing with None.
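
A quick shape check of the trick in plain NumPy (tt.repeat follows the same semantics):

import numpy as np

g = np.arange(5)                                  # stand-in for a length-W GP draw, W = 5
print(g[None, :].shape)                           # (1, 5): indexing with None adds a length-1 leading axis
print(np.repeat(g[None, :], 3, axis=0).shape)     # (3, 5): three identical rows, i.e. an (I, W) array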

@ckrapu Just to make sure I understand this…

My initial specification was:

gamma[0, :] ~ GP(0, K; params)
gamma[1, :] ~ GP(0, K; params)
etc.

To be precise, suppose gamma[0, :] are 52 RVs indexed gamma[0, 0], gamma[0, 1], …, gamma[0, 51]. I want the joint density of gamma[0, 0], gamma[0, 1], …, gamma[0, 51] to be a GP. Likewise, I want the joint density of gamma[1, 0], gamma[1, 1], …, gamma[1, 51] to be a GP (the same GP as gamma[0, :]). Therefore the sets of RVs {gamma[0, :]} and {gamma[1, :]} have identical joint distributions. However, they are different random variables, in the sense that a forward-simulation draw may produce different values for each set. Does that make sense?

I fear that using tt.repeat, we essentially copy gamma[0, :] into gamma[1, :] and so on, meaning we never state that the random variables are distinct. The for loop in my original post, I think, ensures they are distinct by virtue of creating a new set of RVs in each iteration. So I'm not sure tt.repeat is the solution here?

You’re right, that’s quite different from what I had suggested. Perhaps you could consider using LatentKron, per your previous GP question, with an identity matrix as one of the Kronecker product matrices. Then you’d reshape an I*W GP draw into an (I, W) matrix. That way, all of the individual vectors share the same GP prior but are independent.
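
To see why this gives independence: the covariance of the flattened I*W vector becomes the Kronecker product of the identity with K, which is block diagonal, so the W-length blocks belonging to different individuals are uncorrelated (hence independent for a Gaussian), while each block has covariance K. A quick NumPy check of the structure (K below is just a placeholder kernel):

import numpy as np

I_n, W_n = 3, 4
weeks = np.arange(W_n)[:, None]
K = np.exp(-0.5 * (weeks - weeks.T) ** 2)        # placeholder kernel over weeks

# Covariance of the flattened (I_n * W_n,) GP draw under the Kronecker construction
full_cov = np.kron(np.eye(I_n), K)

# Off-diagonal blocks are exactly zero, so pieces belonging to different
# individuals are independent; each diagonal block is K itself
print(np.allclose(full_cov[:W_n, W_n:2 * W_n], 0))   # True
print(np.allclose(full_cov[:W_n, :W_n], K))          # True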

That’s an interesting idea @ckrapu. I will explore it further.

Here’s another idea, but one that takes us out of using the nice aspects of pm.gp.

with pm.Model() as base_model:

    cov_gamma = 1.0 * pm.gp.cov.ExpQuad(input_dim=1, ls=1.0) + pm.gp.cov.WhiteNoise(1e-6)
    cov_gamma_eval = cov_gamma(W)   # W is a theano tensor containing the inputs, size (52,)
    gamma_all = pm.MvNormal("gamma_all", mu=np.zeros(len(W)), cov=cov_gamma_eval, shape=(len(I), len(W)))

In this scenario, I use the shape argument of MvNormal to generate len(I) independent samples of size len(W) (the multivariate normal density is over a random vector of size len(W)).
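
For completeness, a rough sketch of how the likelihood would attach under this approach (y_obs is a placeholder name for the observed (I, W) count matrix):

with base_model:
    # gamma_all has shape (len(I), len(W)); each row is an iid draw from MvNormal(0, K)
    lam = tt.exp(gamma_all)
    y = pm.Poisson("y", mu=lam, observed=y_obs)   # y_obs: observed counts, shape (len(I), len(W))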

Thoughts on this?

@ckrapu I used your pm.gp.LatentKron idea and wrote out the following. It seems to work, and it compiles and samples pretty fast - very clever trick! Leaving the code below for people to track the solution. FixedMatrixCovariance just returns a fixed identity matrix.

I was also able to couple this with the trick where the covariance matrix is itself a Kronecker product (the approach naturally extends to that case).

class FixedMatrixCovariance(pm.gp.cov.Covariance):
    r"""
    Fixed matrix covariance function.

    .. math::

       k(x, x') = \Omega_{ij}
    """

    def __init__(self, omega):
        super().__init__(1, None)
        self.omega = omega          # fixed covariance matrix, independent of the inputs

    def diag(self, X):
        return tt.diag(self.omega)

    def full(self, X, Xs=None):
        # The inputs are ignored; the same fixed matrix is always returned
        return self.omega

with pm.Model() as model:

    #== gamma(i, w) -> size = (I, 52)

    gamma_sigma = pm.Gamma("gamma_sigma", 1, 0.01)
    gamma_rho = 3.
    cov_gamma = gamma_sigma * pm.gp.cov.ExpQuad(input_dim=1, ls=gamma_rho)
    cov_gamma += pm.gp.cov.WhiteNoise(1e-6)

    # pm.gp.LatentKron idea: identity covariance over individuals, ExpQuad over weeks
    identity_matrix = tt.eye(len(I))
    identity_cov = FixedMatrixCovariance(identity_matrix)
    gamma_gp = pm.gp.LatentKron(cov_funcs=[identity_cov, cov_gamma])

    # I_W is the list of the two input arrays (individual indices, weeks), one per covariance
    gamma = gamma_gp.prior("gamma", Xs=I_W)

    # The last input (weeks) varies fastest in the flattened draw, so reshape row-major to (I, W)
    gamma = gamma.reshape((len(I), len(W)))

    trace = pm.sample(cores=4)