For my Gaussian process I have training data whose x values are a minimum of 1 and a maximum of 5 apart. When choosing a length scale prior, where would I want the pdf to be weighted most?
I have experimented with these two so far. They are both inverse gammas, but which would be more appropriate? Or would something else entirely be better?
Later in my project the distances between points will increase slightly. Will I need to change the prior being used? I believe my code uses the sample function to optimise the values, so will that be sufficient?
Try to put most of the prior's probability mass between 1 and 5, the range of distances actually present in your data. I’ve found this page to be a good resource for thinking about the length scale. If you want something simple and quick, a uniform distribution over 1 to 5 could be an okay choice here.
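If you would rather stay with an inverse gamma, one way to get the mass between 1 and 5 is to solve numerically for the shape and scale that leave a small tail on each side. This is only a rough sketch using SciPy: the helper name `fit_inverse_gamma` and the 1% tail mass are my own assumptions, not anything from your code, so adjust them to taste.

```python
import numpy as np
from scipy import optimize, stats


def fit_inverse_gamma(lower, upper, tail_mass=0.01):
    """Find (alpha, beta) with P(l < lower) = P(l > upper) = tail_mass.

    Hypothetical helper: tail_mass=0.01 is an arbitrary choice.
    """

    def tail_errors(log_params):
        # Work in log space so alpha and beta stay positive.
        alpha, beta = np.exp(log_params)
        dist = stats.invgamma(alpha, scale=beta)
        return [dist.cdf(lower) - tail_mass, dist.sf(upper) - tail_mass]

    solution = optimize.root(tail_errors, x0=np.log([5.0, 10.0]))
    if not solution.success:
        raise RuntimeError("could not find matching inverse gamma parameters")
    alpha, beta = np.exp(solution.x)
    return float(alpha), float(beta)


alpha, beta = fit_inverse_gamma(1.0, 5.0)
prior = stats.invgamma(alpha, scale=beta)
# Probability mass the prior places between the two spacings.
mass_inside = prior.cdf(5.0) - prior.cdf(1.0)
print(f"alpha={alpha:.3f}, beta={beta:.3f}, mass in [1, 5]={mass_inside:.3f}")
```

The resulting `alpha` and `beta` can then be passed to whatever inverse gamma prior your modelling library provides, after checking how that library parameterises it.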
"Later in my project the distances between points will increase slightly. Will I need to change the prior being used? I believe my code uses the sample function to optimise the values, so will that be sufficient?"
If you want to retrain your model on all the data points, you should probably adjust your prior as well, in case the interpoint distances have changed. If you are just using the GP to predict values at new data points, then as long as the length scale estimate isn’t close to 5 or greater, it is probably okay to rely on your old posterior estimates.
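A quick way to check that last condition is to look at how much of the length scale posterior sits near or above the old upper spacing. The sketch below is only illustrative: `prior_still_adequate`, the 80% margin, and the 5% cutoff are assumptions of mine, and the draws are made-up placeholders standing in for your real posterior samples.

```python
import numpy as np


def prior_still_adequate(lengthscale_samples, upper_spacing,
                         margin=0.8, max_fraction=0.05):
    """Return (ok, fraction) for posterior draws of the length scale.

    Hypothetical helper: a draw counts as "close to the bound" when it is
    at least margin * upper_spacing; both thresholds are arbitrary.
    """
    samples = np.asarray(lengthscale_samples, dtype=float)
    fraction_near_bound = float(np.mean(samples >= margin * upper_spacing))
    return fraction_near_bound <= max_fraction, fraction_near_bound


# Placeholder draws, NOT real results; substitute your own posterior samples.
example_draws = [2.1, 2.4, 2.8, 3.0, 2.6]
ok, fraction = prior_still_adequate(example_draws, upper_spacing=5.0)
print(ok, fraction)
```

If `ok` comes back false, that is a hint the posterior is pressing against the range the prior was built for, and refitting with an updated prior is the safer route.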