Scaling covariance function in Gaussian Processes


This one is directed to @bwengals since it’s a direct question on one of his notebooks.

In particular, there is a scaling RV σsq_gp that multiplies the overall covariance function.
Can you expand on the choice for this variable and why it was used?
Below there is additional context for the model.

# define positive normal distribution with a lower bound
BoundedNormal = pm.Bound(pm.Normal, lower=0)
# define gp model in pymc3
with pm.Model() as conjugate_gp:
    σsq_gp = pm.HalfCauchy("σsq_gp", beta=2)  # scales the covariance function    
    ℓ = BoundedNormal("ℓ", mu=0.5, sd=1)      # lengthscale of the covariance function (weakly informative)

    k = σsq_gp *, ℓ) # the covariance function


That parameter's role is to scale the covariance function, so it controls the magnitude (the marginal variance) of the GP. To see its effect, set up a GP prior with different values of that parameter and compare what the function samples look like.
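Here is a minimal NumPy sketch of that experiment. It is not the notebook's code: the squared-exponential kernel and the helper names `exp_quad` and `sample_gp_prior` are assumptions made for illustration, since the kernel in the snippet above is not shown.

```python
import numpy as np

def exp_quad(x, lengthscale):
    """Squared-exponential kernel matrix for 1-D inputs (assumed kernel)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def sample_gp_prior(x, scale_sq, lengthscale, n_samples, rng):
    """Draw zero-mean GP prior samples with covariance scale_sq * k(x, x')."""
    K = scale_sq * exp_quad(x, lengthscale)
    K += 1e-6 * np.eye(len(x))            # small jitter so Cholesky succeeds
    L = np.linalg.cholesky(K)
    z = rng.standard_normal((len(x), n_samples))
    return (L @ z).T                      # shape: (n_samples, len(x))

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)

# The spread of the sampled functions should grow like sqrt(scale_sq).
for scale_sq in (0.1, 1.0, 10.0):
    f = sample_gp_prior(x, scale_sq, lengthscale=1.0, n_samples=2000, rng=rng)
    print(f"scale_sq={scale_sq}: sample std = {f.std():.2f}")
```

Plotting a few rows of `f` for each value shows the same wiggliness (set by the lengthscale) but a different vertical range.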

I’d refer you to the pymc3 documentation and examples instead; the code on that blog is extremely out of date.

Thanks for the input.
I’m somewhat familiar with the examples on the pymc3 docs. Great stuff by the way.

I was wondering if the scaling factor was following a formal rule to control the GP.

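Yes it is. It comes from the rules for constructing kernels. Multiplying a kernel (or covariance function) k(x, x') on both sides by some function a(x) gives another valid kernel: a(x) k(x, x') a(x'). With a constant a(x) = σ this reduces to σ² k(x, x'), which is the scaling in that model. With a non-constant a(x) you can do more: for example, to design a changepoint kernel you can use a sigmoid function for a(x).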
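A quick numerical check of that rule, as a sketch only: the squared-exponential base kernel, the helper names `exp_quad` and `scaled_kernel`, and the value σ = 2 are all assumptions chosen for illustration.

```python
import numpy as np

def exp_quad(x, lengthscale=1.0):
    """Squared-exponential kernel matrix for 1-D inputs (assumed base kernel)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def scaled_kernel(x, a, lengthscale=1.0):
    """Kernel matrix with entries a(x_i) * k(x_i, x_j) * a(x_j)."""
    a_vals = a(x)
    return a_vals[:, None] * exp_quad(x, lengthscale) * a_vals[None, :]

x = np.linspace(-5, 5, 40)
sigma = 2.0

# Constant a(x) = sigma: the result is just sigma**2 * k(x, x').
K_const = scaled_kernel(x, lambda t: np.full_like(t, sigma))
print(np.allclose(K_const, sigma**2 * exp_quad(x)))

# Sigmoid a(x): the GP is switched off on the left and on at the right.
K_sigmoid = scaled_kernel(x, lambda t: 1.0 / (1.0 + np.exp(-t)))

# A valid kernel matrix is positive semi-definite, so its smallest
# eigenvalue should be non-negative up to floating-point rounding.
print(np.linalg.eigvalsh(K_sigmoid).min())
```

The diagonal of `K_sigmoid` is the prior variance at each input, so it rises from near zero to near one across the changepoint.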
