GP model and prior prediction analysis

gil_kanfer · June 13, 2022, 3:36pm

Hi,

My first steps of using pymc3 were very Successful. However, now I got into the more complicated aspects of Bayes using pymc3.

I aim to train a classifier (using the Binomial model), six predictors, and a gaussian process for integrating the information from a distance map.

For assessing the feasibility of the project, I have genreated training data:
predictors:

x = np.array((0.06,.01,.02,0.7,0.9,0.8))

Output:
corresponding to six classes:

y =  np.array((0, 0, 0, 1, 1, 1))

Extra information is extracted from the distance map (column are corresponidng to the order of the predictors and output:

Dmat = np.array([[0,1,2,8,3,7],
                             [1,0,3,9,4,7],
                             [2,2,0,5,3,4],
		             [8,9,5,0,2,1],
		             [3,4,3,2,0,2],
		             [7,7,4,1,2,0]])

I will need some help to reverse engineer the model for troubleshooting.

with pm.Model() as gauess_sim_test:
    # hyper-priors
    
    etasq = pm.Exponential("etasq", 10)
    ls_inv = pm.HalfNormal("ls_inv", 1)
    rhosq = pm.Deterministic("rhosq", 0.5 * ls_inv**2)

    # Implementation with PyMC's GP module:
    cov = rhosq **2 * pm.gp.cov.ExpQuad(input_dim=1, ls_inv=ls_inv)
    gp = pm.gp.Latent(cov_func=cov)
    k = gp.prior("k", X=Dmat)
    
    p = pm.Deterministic("p", pm.invlogit(k*x))
    prob = pm.Binomial("prob", 1, p, observed=y)
    
    step = pm.Metropolis()
    trace = pm.sample(
        1, step, progressbar=True
    )  # Make sure not to draw too many samples

I want to apply the cov function to select the most probable priors. It is unclear how to retrieve the output of the function (pm.gp.cov.ExpQuad) ?
In addition, the next step is not clear to me…what is the output of pm.gp.Latent? In the last GP step, priors are drawn from the multiplication of the calculated cov matrix and the distance matrix (X). Is it correct?

The model was built on the base of the Statistical rethinking book (which I love), second edition page 469 (Tools example), and the awesome GitHub support of the book (pymc-resources/Chp_14.ipynb at main · pymc-devs/pymc-resources · GitHub) code 14.39.

I hope I was clear, and thank you for this excellent forum.

bwengals · June 15, 2022, 7:12pm

how to retrieve the output of the function (pm.gp.cov.ExpQuad) ?

import pymc as pm
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

etasq = 1
cov = etasq * pm.gp.cov.ExpQuad(1, ls=2)

# .eval() evaluates the aesara op.  There will be a pause the first time as aesara compiles the function
Kxx = cov(x[:, None]).eval()  

plt.imshow(Kxx);

what is the output of pm.gp.Latent?

Just the GP class. The actual GP you’re probably interested in comes from the call to gp.prior.

In the last GP step, priors are drawn from the multiplication of the calculated cov matrix and the distance matrix (X). Is it correct?

No, not quite… I think you’d say that a GP is the prior over your function k. Samples from the prior are drawn from a multivariate normal with mean 0 and covariance function cov.

Small things:

I think you’re using rhosq as the scale parameter in cov when it should be etasq instead.
Would def recommend sticking with the default NUTS as your sampler here. Metropolis without a carefully tuned proposal distribution will not sample a GP well.

Topic		Replies	Views
Vanilla implementation of Gaussian Process in PyMC3? Questions	16	1123	January 31, 2025
[PyMCon Web Series 03b] Gaussian Processes Live Q&A (Mar 22, 2023) PyMCon Web Series gaussian_process	2	503	April 13, 2023
Multivariate Gaussian Process Classification (GPC) and Prediction Questions	2	901	February 23, 2020
Latent gaussian process prediction problem Questions gaussian_process	3	1078	January 25, 2022
Using Gaussian Process model to make inference? v5 gaussian_process	1	666	February 5, 2023

GP model and prior prediction analysis

Related topics