Gaussian Process regression with Automatic Relevance Determination

Nadheesh · January 24, 2019, 9:53am

Currently, I’m considering to use GP regression for a my project. I typically use following script is to train a GP regression model with PyMC3.

X // features
y // target
with pm.Model() as model:

        ℓ = pm.Gamma("ℓ", alpha=2, beta=1)
        η = pm.HalfCauchy("η", beta=5)

        cov = η ** 2 * pm.gp.cov.Matern52(X.shape[1], ℓ)
        gp = pm.gp.Marginal(cov_func=cov)

        σ = pm.HalfCauchy("σ", beta=1)
        y = gp.marginal_likelihood("y", X=X, y=y, noise=σ)

        map_trace = [pm.find_MAP()]

I’m interested in understanding, how to perform automatic relevance determination for GP regression with PyMC3. With basic regression techniques, we could have used some hyper-priors for the coefficients of features, instead of deterministic values. However, with GP regression I don’t see a way of incorporating such hyper-priors to the kernel function.

Do you have any suggestions? I’m appreciate any help on performing automatic features selection within the models to minimize the effect of the unimportant features.

Thanks!

junpenglao · January 24, 2019, 10:17am

I think you can modify the prior for ℓ and η for that purpose.

Nadheesh · January 25, 2019, 5:10am

I was initially thinking the same approach, but I cannot figure out how that could determine the relevance of the features. Aren’t we suppose to use single dimensional RVs for ℓ and η, assigning a single RV for all the features?

junpenglao · January 25, 2019, 6:39am

In general you can only easily regularise hyper parameters (priors), and for GP especially that’s what you should do. If you want specific regularization on some range of the output (eg, smaller smoothing in a specific range of Y), then you need to design your cov function.

I guess alternatively, you can also try adding a potential that takes the cov function as input.

Nadheesh · January 30, 2019, 8:56am

Thanks for the suggestion. I’m still not convince this is what I’m looking for.

I don’t want to regularize a specific range. As what I have understood, we usually use ARD to determine the importance of each feature. So in a simple case such as linear regression, we use a hyper-prior assigned to the std of each coefficient to control the effect of each feature towards the predictions.

However, ℓ and η are same for all the features according to usual GP implementation. We do not assign a single ℓ for each features, instead we divide the euclidean distance between two data samples (these operations may vary according to the kernel type) using a single dimensional RV ℓ. So how could that control the effect on individual feature differently (according to the relevance of the each feature)?

Nadheesh · January 30, 2019, 8:58am

with pm.Model() as model:
    ℓ = pm.Gamma("ℓ", alpha=2, beta=1, shape = X.shape[1]]) // I changed this line
    η = pm.HalfCauchy("η", beta=5)

    cov = η ** 2 * pm.gp.cov.Matern52(X.shape[1], ℓ)
    gp = pm.gp.Marginal(cov_func=cov)

    σ = pm.HalfCauchy("σ", beta=1)
    y = gp.marginal_likelihood("y", X=X, y=y, noise=σ)

    map_trace = [pm.find_MAP()]

What do you think will happen if I implement GP as shown above? will this define a ℓ _i for each feature since I have defined ℓ with X.shape[1] number of dimensions. (I changed the second line)

junpenglao · January 30, 2019, 11:08am

Oh, I did not understand fully what you meant before. So you have multi dimension in X (eg, 2d GP which the mean function would be a surface), then yes you can assign different ℓ and η to regularized the feature in each dimension.

Nadheesh · January 31, 2019, 7:59am

So can I achieve that by simply setting the shape of ℓ and η to X.shape[1] (number of features) ?

junpenglao · January 31, 2019, 9:14am

I think so.

Nadheesh · February 1, 2019, 4:23am

Do we have someone who is familiarize with the PyMC3 GP implementation? I wanted to verify that setting ℓ and η with a shape of X.shape[1] (number of dimensions) is gonna consider each ℓ_i for i^{th} feature during the computations?

junpenglao · February 1, 2019, 5:41am

@bwengals

bwengals · February 2, 2019, 1:28am

yessir it will. Although for ARD, eta is a scalar.

Nadheesh · February 11, 2019, 5:47am

Thanks @bwengals, I did not change the size of eta, but instead of a scalar I used a RV as shown in the scripts. I assumed that is better than using a scaler according to Bayesian.

indian312 · August 6, 2019, 3:40pm

My intuition is : use a loop like
n = no_of_l_s_required
Dummy = [‘pm.Gamma(’, ‘pm.Normal(’]

Alpha = [ Alpha_1, Alpha_2, ……Alpha_n] #expected values
Beta = [ Beta_1, Beta_2, ……Beta_n] #expected values

Mu = [ Mu_1, Mu_2, ……Mu_n] #expected values
Sd = [ Sd_1, Sd_2, ……Sd_n] #expected values

L = []

for i in range():

  L.append('l[' + str(i) + ']'+ '=' +  Dummy[1] + 'l[' + str(i) + '],' + 'mu =' + str(Mu[i]) + ',' + str(Sd[i]) + ')')

one can also use a conditional to change it to other distribution when required

exiting the loop

evaluating the ‘L’ . usually not recommended for a code from suspicious sources

eval(L)

###############################

Please let me know if it is what you might like to use in your code.

Nadheesh · August 13, 2019, 10:49am

Thanks @indian312,

Don’t need a loop, just by setting shape of length scale as X.shape[1] (number of features) we can achieve that.

ℓ = pm.Gamma("ℓ", alpha=2, beta=1, shape = X.shape[1]])

Topic		Replies	Views
Sparse GP with linear covariate? Questions	2	597	August 3, 2021
Using Gaussian Process model to make inference? v5 gaussian_process	1	660	February 5, 2023
GP and Bayesian linear regression in PyMC Questions	8	1404	August 18, 2018
Guassian process regression, Example 1: attribute GP nor found Questions	2	938	February 19, 2022
GP model and prior prediction analysis gaussian_process , modeling	1	419	June 15, 2022

Gaussian Process regression with Automatic Relevance Determination

one can also use a conditional to change it to other distribution when required

exiting the loop

evaluating the ‘L’ . usually not recommended for a code from suspicious sources

Related topics