Hi there everyone,
I am a little confused about how `logp` works for a model object in PyMC3. I have a simple model for a dataset y_1, \dots, y_n given by y \sim GP(0, k_\theta(x,x')). I am trying to approximate the integral
p(y_1, \dots, y_n) = \int_\theta p(y_1, \dots, y_n|\theta)p(\theta)d\theta
using plain Monte Carlo, i.e., I need to evaluate the likelihood p(y_1, \dots, y_n|\theta), and I am a little confused about how to get what I am looking for. My model is:
```python
alpha = 2
priors = {"eta": ["gamma", alpha, 1],
          "rho": ["gamma", alpha, 1]}

with pm.Model() as gp_mass:
    eta1 = pm.Gamma("eta1", alpha=priors["eta"][1], beta=priors["eta"][2], transform=None)
    rho1 = pm.Gamma("rho1", alpha=priors["rho"][1], beta=priors["rho"][2], shape=2, transform=None)
    # GP definition
    # mean
    gp_mean_1 = pm.gp.mean.Zero()
    # covariance
    gp_cov_1 = eta1 ** 2 * pm.gp.cov.ExpQuad(X1.shape[1], ls=rho1, active_dims=[1, 2])
    # GP
    gp_model_1 = pm.gp.Marginal(mean_func=gp_mean_1, cov_func=gp_cov_1)
    y_gp_1 = gp_model_1.marginal_likelihood("y1", X=X1, y=y1, noise=0.01, transform=None)
```
I set `transform=None` everywhere because https://github.com/pymc-devs/pymc3/issues/789 says that is the way to get the likelihood evaluated exactly (I need it exactly, not up to a constant). This is how I approximate the integral:
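To be explicit, the estimator I have in mind is the plain Monte Carlo average of the likelihood over draws from the prior:

p(y_1, \dots, y_n) \approx \frac{1}{N} \sum_{i=1}^{N} p(y_1, \dots, y_n \mid \theta^{(i)}), \quad \theta^{(i)} \sim p(\theta)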
```python
n_mc = 10000
np.random.seed(0)

with pm.Model() as priors_model:
    if priors["eta"][0] == "gamma":
        eta1 = pm.Gamma("eta1", alpha=priors["eta"][1], beta=priors["eta"][2], transform=None)
    if priors["rho"][0] == "gamma":
        rho1 = pm.Gamma("rho1", alpha=priors["rho"][1], beta=priors["rho"][2], shape=2, transform=None)
    trace_priors = pm.sample(n_mc, tune=10000, chains=1)

log_likelihood1 = np.empty(n_mc)
mc_integral1 = np.empty(n_mc)
log_likelihood2 = np.empty(n_mc)
mc_integral2 = np.empty(n_mc)

# Compile the logp functions once, outside the loop
logp_1 = y_gp_1.logp
logp = gp_mass.logp

for i in tqdm(range(n_mc), desc="Log likelihood eval"):
    # logp takes a point (a dict of variable values); it has no transform argument
    log_likelihood1[i] = logp_1(trace_priors[i])
    log_likelihood2[i] = logp(trace_priors[i])

for i in tqdm(range(n_mc), desc="Integral calc"):
    mc_integral1[i] = np.sum(np.exp(log_likelihood1[:(i + 1)])) / (i + 1)
    mc_integral2[i] = np.sum(np.exp(log_likelihood2[:(i + 1)])) / (i + 1)
```
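As an aside (not specific to PyMC3): averaging `np.exp(log_likelihood)` directly can underflow to zero when the log-likelihoods are very negative, which is common for GP marginal likelihoods on more than a handful of points. Keeping the running average in log space with `scipy.special.logsumexp` avoids this. A minimal sketch, using made-up log-likelihood values in place of `logp_1(trace_priors[i])`:

```python
import numpy as np
from scipy.special import logsumexp

# Hypothetical log-likelihood values standing in for logp_1(trace_priors[i])
log_likelihood = np.array([-1050.0, -1052.3, -1049.1, -1051.7])

# Running Monte Carlo estimate kept in log space:
# log( (1/n) * sum_i exp(ll_i) ) = logsumexp(ll_1..n) - log(n)
n = np.arange(1, log_likelihood.size + 1)
log_mc_integral = np.array(
    [logsumexp(log_likelihood[:i]) - np.log(i) for i in n]
)

# Direct averaging underflows to exactly 0.0 in float64,
# while the log-space running estimate stays finite.
print(np.exp(log_likelihood).mean())  # 0.0 (underflow)
print(log_mc_integral[-1])
```

If you only need to compare estimates (e.g. for a Bayes factor), you can work with `log_mc_integral` directly and never exponentiate.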
I am getting different results depending on whether I use `gp_mass.logp` or `y_gp_1.logp`, which is maybe not all that surprising, but which of these densities is the log of p(y_1, \dots, y_n|\theta)?
Thanks!