Dear All,
I am a beginner learning to apply Gaussian processes. I see that in PyMC3 the periodic kernel is implemented as:
k(x, x') = \mathrm{exp}\left( -\frac{\mathrm{sin}^2(\pi |x-x'| \frac{1}{T})}{2\ell^2} \right)
However, in other sources, including David Duvenaud’s Kernel Cookbook and one place in the PyMC3 documentation, the formula is given as:
k(x, x') = \exp\left( -\frac{2 \sin^{2}(\pi |x - x'|\frac{1}{T})}{\ell^2} \right)
I am a little confused here. Does this difference (placement of “2”) change the interpretation of the lengthscale (\ell)? Or am I simply missing something?
Thank you!
Ooh, nice catch! PyMC3’s source code does seem to use the first option. To add to the apparent confusion, it seems like we’re using another slightly different implementation in PyMC4:
k(x, x') = \sigma^2 \mathrm{exp}\left( -\frac{2\mathrm{sin}^2(\pi ||x-x'||^2 \frac{1}{T})}{\ell^2} \right)
I think getting @bwengals and @tirthasheshpatel perspectives would be interesting here!
You are right. PyMC3’s implementation is a little different. I don’t think that should make much difference during inference, but it does change the interpretation of the lengthscale, which is not a good thing. IMO, for the sake of correctness, this should be changed to the equation below. Still, @bwengals has more experience with GPs, so he would be your best guide. As this is a bug, you can open an issue on the GitHub repo and even resolve it in a PR, if you wish.
Yes, true! It would change the interpretation of the lengthscale. A constant factor there will change how you think of the lengthscale and what priors you’d set on it.
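To make the constant factor concrete: the two formulas from the original post agree exactly when the lengthscale in the second one is twice the lengthscale in the first, since 1/(2\ell_1^2) = 2/\ell_2^2 gives \ell_2 = 2\ell_1. Here is a minimal sketch in plain Python that checks this numerically. The function names and the values of the period and lengthscale are made up for illustration; this is not PyMC3 or GPflow API.

```python
import math


def periodic_half_convention(d, period, ls):
    """First formula in the thread: exp(-sin^2(pi * d / T) / (2 * ls^2))."""
    return math.exp(-math.sin(math.pi * d / period) ** 2 / (2.0 * ls ** 2))


def periodic_cookbook_convention(d, period, ls):
    """Second formula in the thread: exp(-2 * sin^2(pi * d / T) / ls^2)."""
    return math.exp(-2.0 * math.sin(math.pi * d / period) ** 2 / ls ** 2)


# Arbitrary illustrative values (assumptions, not taken from any model).
period, ls = 1.5, 0.7
distances = [0.05 * i for i in range(61)]  # |x - x'| from 0 to 3

# Largest gap between the conventions when the same lengthscale is used.
gap_same_ls = max(
    abs(periodic_half_convention(d, period, ls)
        - periodic_cookbook_convention(d, period, ls))
    for d in distances
)

# Largest gap once the cookbook lengthscale is doubled.
gap_doubled_ls = max(
    abs(periodic_half_convention(d, period, ls)
        - periodic_cookbook_convention(d, period, 2.0 * ls))
    for d in distances
)

print("same lengthscale:   ", gap_same_ls)
print("doubled lengthscale:", gap_doubled_ls)
```

With the same lengthscale the two kernels differ noticeably; with the doubled lengthscale the gap is at floating-point level. So a prior chosen for \ell under one convention would need rescaling by a factor of 2 under the other.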
The PyMC3 covariance functions were largely based on GPflow, which does use the (apparently nonstandard) -0.5 convention.
It would definitely be best to use the standard & most common formula, but I worry changing it could lead to subtle bugs in users’ models that have Periodic in them… How are issues like this handled in PyMC? Should we file an issue? PyMC4 should definitely use the most common convention though, to avoid issues like this down the road!
Should we file an issue?
Yeah! Filing an issue seems like the way to go.
PyMC4 should definitely use the most common convention though
Yeah, PyMC4 uses the standard representation.
Thanks for the clear answers, guys. This actually sheds light on my own learning path on GPs, and may be related to problems we’re having porting chapter 14 of Rethinking 2.
I will file an issue ASAP! Unless @petro_sampler wants to do it?
One question though: what do you guys mean by “the most common convention”? PyMC4’s (the one in my first post), or the second option in @petro_sampler’s post?
Hi Alex, if you can file the issue that would be great - thank you!
And based on my (little) understanding, you might not want to square the |x-x'| term.
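One way to see why squaring the distance looks suspicious: a periodic kernel should return to 1 when |x-x'| equals the period T, but with the squared distance inside the sine that generally no longer holds. The sketch below compares the two forms at |x-x'| = T. It is a rough check in plain Python with hypothetical function names and an arbitrarily chosen T and \ell, and it only mirrors the formulas quoted in this thread, not any library's actual code.

```python
import math


def periodic_plain_distance(d, period, ls):
    """exp(-2 * sin^2(pi * d / T) / ls^2), with d = |x - x'|."""
    return math.exp(-2.0 * math.sin(math.pi * d / period) ** 2 / ls ** 2)


def periodic_squared_distance(d, period, ls):
    """Same form, but with d squared inside the sine, as in the formula quoted above."""
    return math.exp(-2.0 * math.sin(math.pi * d ** 2 / period) ** 2 / ls ** 2)


# Arbitrary illustrative values (assumptions for the demo).
period, ls = 1.5, 1.0

# Evaluate both forms for two points exactly one period apart.
k_plain = periodic_plain_distance(period, period, ls)
k_squared = periodic_squared_distance(period, period, ls)

print("plain distance at d = T:  ", k_plain)
print("squared distance at d = T:", k_squared)
```

With the plain distance, the covariance one period apart is essentially 1, as expected. With the squared distance and T = 1.5, the sine argument becomes 1.5\pi, so the covariance drops to exp(-2) instead of repeating.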
Thanks guys! Much appreciated.