How to put a restriction on the value of hyper-parameters in Gaussian process

Hi Jiadaren!

Many people point to Michael Betancourt’s blog post on exactly this question. You should read through the whole thing, he gives a very good overview of the problem, but the essence of it is he suggests putting 99% of the (Inverse Gamma) prior density on the lengthscale between the shortest and longest pairwise distance in your dataset. My code for doing this looks something like:

from scipy.spatial.distance import pdist

distances = pdist(X)

ℓ_l = distances[distances!=0].min()
ℓ_u = distances[distances!=0].max()
ℓ_σ = (ℓ_u-ℓ_l)/6
ℓ_μ = ℓ_l + 3*ℓ_σ

with pm.Model() as model:
    ℓ = pm.InverseGamma('ℓ', mu=ℓ_μ, sigma=ℓ_σ)
    η = pm.Gamma('η', alpha=2, beta=1)
    ...

Note that this doesn’t chop the prior off at the lower bound, but it does make it much more likely that your MAP will be above it.

6 Likes