Avoiding looping when using GP prior on latent variables

Jamal_Bajwa · March 29, 2022, 6:30pm

Hello PyMC devs and users,

I’m just getting more acquainted with PyMC3, so apologies if this is a silly issue.

I am working on a model whose essence can be described by the following model:

y[i,w] ~ Poisson(lam[i, w]) (Likelihood)
lam[i,w] = exp(gamma[i,w]) 
gamma[i, w] ~ GP(0, K(w, w'; params)) for each i, where K has size [w by w]

i can be interpreted as individuals, i \in {1, 2, ..., I}
w can be interpreted as weeks, w \in {1, 2, ..., W}

In other words, I have a inhomogenous poisson process, where the arrival rate term is a Gaussian Process. For each individual i, I want the same Gaussian Process to describe the joint density of gamma[i, :].

In the model context, I have written this using a for loop as follows:

cov_gamma = gamma_sigma * pm.gp.cov.ExpQuad(input_dim=1, ls=1) 
gamma_gp = pm.gp.Latent(cov_func=cov_gamma)

#==== Loop over each individual to extract a draw (into an IxW matrix)

gamma_iw = tt.zeros(shape=(len(I), len(W))) 

for i in range(0, len(I)):
    gamma = gamma_gp.prior(f"gamma_{i}", X=W)        # W is vector of weeks
    tt.subtensor.set_subtensor(gamma_iw[i, :], gamma)

I believe this is correct. But compilation of this model is extremely slow as “I” increases. Seeing as the GP is compiled to pm.MultivariateNormal(), where we can adjust the shape, is there a more straightforward to specify this instead of the for loop?

I hope my question was clear, but I’m happy to clarify further if needed!

ckrapu · March 29, 2022, 9:58pm

Try this:

gamma = gamma_gp.prior(f"gamma_{i}", X=W)
gamma_iw = tt.repeat(gamma[:, None], I, axis=0)

This just adds another dimension and then repeats over that dimension to give you I identical copies of gamma in an array with shape (I, W).

Jamal_Bajwa · March 29, 2022, 11:03pm

Hm, just tried it. Getting the following error related to tt.repeat call:

File ~/.conda/envs/routines/lib/python3.9/site-packages/theano/tensor/extra_ops.py:786, in repeat(x, repeats, axis)
    783 repeats = basic.as_tensor_variable(repeats)
    785 if repeats.ndim > 1:
--> 786     raise ValueError("The dimension of repeats should not exceed 1.")
    788 if repeats.ndim == 1 and not repeats.broadcastable[0]:
    789     return RepeatOp(axis=axis)(x, repeats)

ValueError: The dimension of repeats should not exceed 1.

Jamal_Bajwa · March 29, 2022, 11:21pm

@ckrapu the basic idea of your code is correct. I was a vector in my code, but I’m sure you meant len(I) as nrepeats, so I should’ve caught that. So I think the following works:

gamma = gamma_gp.prior(f"gamma_{i}", X=W)
gamma_iw = tt.repeat(gamma[:, None], len(I), axis=1)

But one small point: gamma_iw is a (W, I) matrix because I set axis=1. If axis=0, I get a (I*W, 1) matrix. axis=0 does not quite do what you expected from the repeats operation? I might just have to transpose, which isn’t an issue.

ckrapu · March 29, 2022, 11:24pm

Ah, my mistake then. You could also do

gamma_iw = tt.repeat(gamma[None ,:], len(I), axis=0)

and that should give you the (I,W) matrix. The trick here is to create the extra dimension with length 1 using the None slice.

Jamal_Bajwa · March 30, 2022, 2:27am

@ckrapu Just to make sure I understand this…

My initial specification was:

gamma[0, :] ~ GP(0, K; params)
gamma[1, :] ~ GP(0, K; params)
etc.

To be precise, suppose gamma[0, :] are 52 RVs indexed gamma[0, 0], gamma[0, 1], …, gamma[0, 52]. I want the the joint density of gamma[0, 0], gamma[0, 1], …, gamma[0, 52] to be a GP. Also, I want the joint density of gamma[1, 0], gamma[1, 1], …, gamma[1, 52] to be a GP (the same GP as gamma[0, :]). Therefore, set of RVs {gamma[0, :]} and {gamma[1, :]} have identical joint distributions. However, these are different random variables, in the sense that a forward simulation draw might lead to different values for each set. Does that make sense?

I fear that using tt.repeat, we essentially repeat the gamma[0, :] to be gamma[1, :] and so on, meaning that we don’t explicitly state that the random variables are different. The for loop in my original post, I think, ensures they are different by virtue of creating a new set of RVs in each iteration. So I’m not sure tt.repeat is the solution here?

ckrapu · March 30, 2022, 4:04am

You’re right, that’s quite different than what I had suggested. Perhaps you could consider using the LatentKron per your previous GP question and using an identity matrix as one of the Kronecker product matrices. Then, you’d reshape an I*W GP draw into and I,W matrix. That way, all of the individual vectors will share the same GP prior but will be independent.

Jamal_Bajwa · March 30, 2022, 1:56pm

That’s an interesting idea @ckrapu. I will explore it further.

Here’s another idea, but one that takes us out of using the nice aspects pm.gp.

with pm.Model() as base_model:
    
    cov_gamma = 1.0 * pm.gp.cov.ExpQuad(input_dim=1, ls=1.0) + pm.gp.cov.WhiteNoise(1e-6)
    cov_gamma_eval = cov_gamma(W) # W is a theano.tensor containing input W of size (52, )
    gamma_all = pm.MvNormal("gamma_all", mu=np.zeros(len(W)), cov=cov_mu_cov, shape=(len(I), len(W)))

In this scenario, I use the shape argument of the MvNormal to generate len(I) samples of size len(W) (Multivariate Normal density is on a random vector of size len(W)).

Thoughts on this?

Jamal_Bajwa · March 30, 2022, 6:45pm

@ckrapu I used your idea pm.gp.LatentKron idea and wrote out the following. It seems to work, compiles and samples pretty fast - very clever trick! Leaving the code below for people to track the solution. FixedMatrixCovariance just returns a fixed identity matrix.

I was able to couple the trick where the covariance matrix is a kronecker product (naturally extends to that case).

class FixedMatrixCovariance(pm.gp.cov.Covariance):
    r"""
    Fixed matrix covariance function.

    .. math::

       k(x, x') = omega(i, j)
    """

    def __init__(self, omega):
        super().__init__(1, None)
        self.omega = omega

    def diag(self, X):
        return at.diag(self.omega)

    def full(self, X, Xs=None):
        if Xs is None:
            return self.omega
        else:
            return self.omega

with pm.Model() as gamma_gp:
    
    #== gamma(i, w) -> size = (I, 52)
    
    gamma_sigma = pm.Gamma("gamma_sigma", 1, 0.01) 
    gamma_rho = 3.    
    cov_gamma = gamma_sigma * pm.gp.cov.ExpQuad(input_dim=1, ls=gamma_rho) 
    cov_gamma += pm.gp.cov.WhiteNoise(1e-6)
    
    # pm.gp.LatentKron idea
    
    identity_matrix = tt.eye(len(I))
    identity_cov = FixedMatrixCovariance(identity_matrix)
    gamma_gp = pm.gp.LatentKron(cov_funcs=[identity_cov, cov_gamma])
    gamma = gamma_gp.prior(f"gamma", Xs=I_W)
    gamma = gamma.reshape((len(W), len(I))).T
    trace = pm.sample(cores=4)

Topic		Replies	Views
Multiple Gaussian processes without for loop? Questions gaussian_process	6	670	January 19, 2022
Generalised Wishart Processes - stuck with accessing variables in a loop Questions	0	467	August 15, 2019
Using Gaussian Process model to make inference? v5 gaussian_process	1	660	February 5, 2023
Gaussian Process -Statistical rethinking v5 gaussian_process	2	836	January 28, 2023
Using a GP as Covariance Matrix for an MvNormal Questions	4	755	June 15, 2020

Avoiding looping when using GP prior on latent variables

Related topics