First of all, thanks for the quick replies.
I tested some of your suggestions.
> resample your model using `pm.gp.Latent` with a separate `Normal` likelihood. The sum of the log-likelihoods calculated this way will equal the log-likelihood from `pm.gp.Marginal`.
Using the model from my first message, I wrote its Latent version (which computes the correct elementwise log-likelihood), but the sums of the log-likelihoods do not match: I get -3.37e+05 for the Latent model and -4.39e+05 for the Marginal model. I set the random seed as suggested here.
The implementation of the Latent model is the following:
```python
rng = np.random.default_rng(42)

with pm.Model() as model:
    ell = pm.Gamma("ell", alpha=2, beta=1)
    eta = pm.HalfCauchy("eta", beta=5)
    cov = eta**2 * pm.gp.cov.Matern52(1, ell)
    gp = pm.gp.Latent(cov_func=cov)
    f = gp.prior("f", X=X)
    sigma = pm.HalfCauchy("sigma", beta=5)
    y_ = pm.Normal("y", f, sigma, observed=y)
    latent_post = pm.sample(
        nuts_sampler="pymc",
        idata_kwargs={"log_likelihood": True},
        random_seed=rng,
    )

# I calculated the sum this way
print(latent_post.log_likelihood.sum())
# and compared it to the marginal model
print(marginal_post.log_likelihood.sum())
```
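For clarity, here is what that `.sum()` collapses, sketched with numpy on a made-up `(chain, draw, obs)` array (the shapes and values are illustrative only, not my actual data). It sums over chains, draws, and observations at once; reducing the observation axis first gives a per-draw total that can then be averaged:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for idata.log_likelihood["y"]: shape (chain, draw, obs)
log_lik = rng.normal(-1.0, 0.1, size=(4, 250, 50))

per_draw_total = log_lik.sum(axis=-1)  # total log-likelihood per posterior draw
avg_total = per_draw_total.mean()      # averaged over chains and draws
grand_total = log_lik.sum()            # what .sum() with no arguments gives
```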
> inside that context call `pm.compute_log_likelihood`, passing in the idata from the marginalized version.
As suggested, I did the following:
```python
with pm.Model() as model:
    ell = pm.Gamma("ell", alpha=2, beta=1)
    eta = pm.HalfCauchy("eta", beta=5)
    cov = eta**2 * pm.gp.cov.Matern52(1, ell)
    gp = pm.gp.Latent(cov_func=cov)
    f = gp.prior("f", X=X)
    sigma = pm.HalfCauchy("sigma", beta=5)
    y_ = pm.Normal("y", f, sigma, observed=y)
    latent_post = pm.compute_log_likelihood(marginal_post)
```
and I get the following error:
```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/opt/conda/envs/dev/lib/python3.12/site-packages/xarray/core/dataset.py in ?(self, names)
   1474                 variables[name] = self._variables[name]
   1475             except KeyError:
-> 1476                 ref_name, var_name, var = _get_virtual_variable(
   1477                     self._variables, name, self.sizes
KeyError: 'f_rotated_'
```
The only method that has worked so far is resampling the model with the Latent implementation, but that is very slow and not feasible in my application.
Do you have any ideas on how to solve this? My fallback would be to calculate it manually; any pointers on that would also be very helpful.
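For the manual route, this is roughly what I have in mind (a numpy-only sketch with made-up data; the kernel and function names are mine, not PyMC API): for one posterior draw of `(ell, eta, sigma)` from the marginal model, sample `f` from its Gaussian conditional `p(f | y, theta)` and evaluate the Normal log-density elementwise.

```python
import numpy as np

def matern52(Xa, Xb, ell, eta):
    # Matern 5/2 kernel on 1-D inputs, same functional form as pm.gp.cov.Matern52
    d = np.abs(Xa[:, None] - Xb[None, :])
    r = np.sqrt(5.0) * d / ell
    return eta**2 * (1.0 + r + r**2 / 3.0) * np.exp(-r)

def pointwise_loglik(X, y, ell, eta, sigma, rng):
    n = y.size
    K = matern52(X, X, ell, eta)
    Ky = K + sigma**2 * np.eye(n)
    L = np.linalg.cholesky(Ky)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = K @ alpha                               # E[f | y, theta]
    V = np.linalg.solve(L, K)
    cov = K - V.T @ V                            # Cov[f | y, theta]
    cov = (cov + cov.T) / 2 + 1e-9 * np.eye(n)   # symmetrize + jitter
    f = rng.multivariate_normal(mu, cov)         # one draw of f | y, theta
    # elementwise log p(y_i | f_i, sigma)
    return -0.5 * np.log(2 * np.pi * sigma**2) - (y - f) ** 2 / (2 * sigma**2)

rng = np.random.default_rng(42)
X = np.linspace(0, 1, 20)
y = np.sin(6 * X) + 0.1 * rng.standard_normal(20)
ll = pointwise_loglik(X, y, ell=0.3, eta=1.0, sigma=0.1, rng=rng)
```

Repeating this per posterior draw would give the `(chain, draw, obs)` array that `idata.log_likelihood` normally holds; I am not sure this is the intended workflow, which is part of my question.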
I also need to ask about another thing related to this topic. I am using `pm.compute_log_likelihood` on a sparse GP. My understanding is that the sparse GP implementation is done with a `pm.Potential`, which `pm.compute_log_likelihood` ignores. Is that correct?
If so, how can I compute it? My guess is that the best way is to compute it manually, because writing the full Latent model and resampling would be too slow.
In theory, at least from my understanding, this pointwise log-likelihood (from the sparse model) would be calculated from the likelihood approximation given by FITC or VFE.
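If it helps, this is my current sketch of what that FITC pointwise term might look like (numpy only; the data, names, and the stand-in draw of the inducing values `u` are all made up). Under FITC the observations are conditionally independent given `u`, with `y_i | u ~ N((Kfu Kuu^{-1} u)_i, (Kff - Qff)_ii + sigma^2)`, so a per-point log-density per draw of `u` seems well-defined:

```python
import numpy as np

def matern52(Xa, Xb, ell, eta):
    # Matern 5/2 kernel on 1-D inputs, same functional form as pm.gp.cov.Matern52
    d = np.abs(Xa[:, None] - Xb[None, :])
    r = np.sqrt(5.0) * d / ell
    return eta**2 * (1.0 + r + r**2 / 3.0) * np.exp(-r)

def fitc_pointwise_loglik(X, Xu, y, u, ell, eta, sigma):
    Kuu = matern52(Xu, Xu, ell, eta) + 1e-9 * np.eye(Xu.size)
    Kfu = matern52(X, Xu, ell, eta)
    kff_diag = np.full(X.size, eta**2)        # Matern52 at distance 0 is eta^2
    A = np.linalg.solve(Kuu, Kfu.T)           # Kuu^{-1} Kuf
    mean = Kfu @ np.linalg.solve(Kuu, u)      # E[f_i | u]
    qff_diag = np.sum(Kfu * A.T, axis=1)      # diag(Kfu Kuu^{-1} Kuf)
    var = kff_diag - qff_diag + sigma**2      # FITC per-point variance
    return -0.5 * np.log(2 * np.pi * var) - (y - mean) ** 2 / (2 * var)

rng = np.random.default_rng(0)
X = np.linspace(0, 1, 30)
Xu = np.linspace(0, 1, 8)           # inducing points
y = np.sin(6 * X) + 0.1 * rng.standard_normal(30)
u = np.sin(6 * Xu)                  # stand-in for a posterior draw of u
ll = fitc_pointwise_loglik(X, Xu, y, u, ell=0.3, eta=1.0, sigma=0.1)
```

I am not sure whether this (or the VFE analogue) is the right pointwise quantity to feed into model comparison, so corrections are welcome.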
Thanks again