One side note by way of unsolicited advice, I’m suspicious that this line is going to give you issues:
y_ = gp.marginal_likelihood("y",X=x,y=y, noise=s+er)
I did something similar here, so make sure that behavior is what you’re going for. I think the s variable there isn’t going to do anything, I’m not certain, but I suspect the posterior will look just like the prior, or maybe it’ll essentially get as close as it can to 0. And I think that including observed noise will mean that you can’t make predictions with_noise, if that’s a goal at some point.
I’m not sure if this is possible with your experimental set up, but generally I find it’s better to provide all the individual (noisy) observations to a GP, rather than aggregate them via mean/std and trying to fit a GP to the summary statistics. But of course, YMMV.