Try to get most of the probability mass between 1 and 5. I’ve found this page to be a good resource for thinking about the length scale. A uniform distribution over 1 to 5 could be an okay choice here, if you want something simple and quick.
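If it helps to check a candidate prior before committing to it, here is a minimal sketch that computes how much probability mass falls between 1 and 5. It uses only the Python standard library and is independent of whatever modelling package you are using. The log-normal alternative and its parameters (`MU`, `SIGMA`) are my own illustrative assumptions, not a recommendation from the page mentioned above.

```python
import math


def uniform_cdf(x, low, high):
    """CDF of a Uniform(low, high) distribution."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal distribution with log-scale mean mu and sd sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def mass_between(cdf, low, high):
    """Probability that a draw from the distribution lands in [low, high]."""
    return cdf(high) - cdf(low)


LOW, HIGH = 1.0, 5.0

# The simple option from the post: Uniform(1, 5) puts all of its mass in range.
uniform_mass = mass_between(lambda x: uniform_cdf(x, LOW, HIGH), LOW, HIGH)

# Assumed alternative: a log-normal centred on the geometric midpoint of 1 and 5.
# SIGMA = 0.4 is an arbitrary illustrative choice.
MU = 0.5 * (math.log(LOW) + math.log(HIGH))
SIGMA = 0.4
lognormal_mass = mass_between(lambda x: lognormal_cdf(x, MU, SIGMA), LOW, HIGH)

print(f"Uniform(1, 5) mass in [1, 5]:  {uniform_mass:.3f}")
print(f"Log-normal mass in [1, 5]:     {lognormal_mass:.3f}")
```

The same `mass_between` check works for any prior you can write a CDF for, so you can rerun it with new bounds if the distances between your points change.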
Further on in my project, the distances between points will increase slightly. Will I need to change the function being used? I believe my code uses the sample function to optimise the values used; will this be sufficient?
If you want to retrain your model on all the data points, you should probably adjust your prior as well, in case the interpoint distances change. If you are just using the GP to predict values at new data points, it is probably okay to rely on your old posterior estimates, as long as the length scale estimate isn't close to 5 or greater.
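One rough way to check whether the length scale estimate is "close to 5" is to look at what fraction of the posterior draws sit near the upper end of the prior range. The sketch below assumes you can pull the length scale draws out of your fit as a plain list of floats. The function name `fraction_near_upper`, the `margin` of 0.5, and the example draws are all made up for illustration.

```python
def fraction_near_upper(samples, upper=5.0, margin=0.5):
    """Fraction of draws at or above (upper - margin).

    A large fraction suggests the posterior is pushing against the top of
    the prior range, so the prior may need widening before reusing the fit.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    threshold = upper - margin
    return sum(1 for s in samples if s >= threshold) / len(samples)


# Hypothetical posterior draws of the length scale (not real results).
lengthscale_draws = [2.1, 2.4, 1.9, 2.8, 3.0, 2.2, 2.6, 4.7, 2.3, 2.5]

frac = fraction_near_upper(lengthscale_draws)
print(f"Fraction of draws within 0.5 of the upper bound: {frac:.2f}")
```

There is no standard cutoff for what counts as too large a fraction. If more than a small share of the draws are bunched near 5, I would treat that as a sign to widen the prior and refit.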