Of course!
-
It might; it definitely wouldn't hurt, but I've noticed things are a bit more sensitive to the Y scaling. If the scale of X is very far from [-3, 3] it could help. I do try to avoid standardizing X because it makes the lengthscale parameters a bit harder to set priors for and interpret for GPs (though you could easily rescale the lengthscales afterward). On the other hand, standardizing Y makes it easier (for me at least) to set priors for things like the overall amplitude (`eta`) and the different noise sources, since you know something like `pm.HalfNormal('eta', sigma=1)` will probably give reasonable results as a starting point.
-
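As a quick sketch of what I mean by standardizing Y (the data here is made up for illustration): once `y` has zero mean and unit standard deviation, its overall amplitude is O(1), so a prior like `pm.HalfNormal('eta', sigma=1)` on the GP amplitude puts most of its mass in a plausible range.

```python
import numpy as np

# Hypothetical raw observations on some arbitrary scale
y = np.array([10.0, 12.5, 9.0, 15.0, 11.0])

# Standardize: zero mean, unit standard deviation
y_mean, y_sd = y.mean(), y.std()
y_std = (y - y_mean) / y_sd

# y_std now has amplitude O(1), so in the PyMC model something like
#   eta = pm.HalfNormal('eta', sigma=1)
# is a sensible default prior for the GP amplitude.
# Remember to undo the transform on predictions: y_pred = f_pred * y_sd + y_mean
```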
So the default is 1000 tuning steps, and the 500 is the number of draws kept after tuning. So here it ran 3000 samples total (1500 per chain across 2 chains), dropped the first 1000 of each chain, and returned the last 500 of each chain. It's definitely always better to use a bigger number of draws, but for debugging / prototyping it's better to iterate faster, so that's the only reason I cut the number of samples.
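The arithmetic above, as a quick sketch (assuming 2 chains, which is what the 3000 total implies):

```python
# How pm.sample(draws=500, tune=1000) accounting works:
# each chain runs tune + draws samples, and the tuning samples are discarded.
tune, draws, chains = 1000, 500, 2

total_run = chains * (tune + draws)  # samples actually generated
returned = chains * draws            # draws kept in the trace

print(total_run, returned)  # 3000 total run, 1000 draws returned
```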
Hope that helps