# Limiting the domain of sampling from a Gaussian process

I have fit a Gaussian process model, calibrated to multivariate X and scalar y data that come from an expensive (and gradient-free) function. Let's use an image from the manual as an example:

Once I have fit my Gaussian process to some data, I compute the median values of the Gaussian process parameters and use them in another pymc3 model, sampling from the GP with gp.Latent:

```python
cov_func = median_eta.reshape([1, 1])**2. * hps['covfxn'](len(rvs), ls=median_ls)
gp = pm.gp.Latent(cov_func=cov_func)
gp_cond = gp.conditional('median_gp', Xnew=some_new_X,
                         given={'X': fit_gp.X,
                                'f': fit_gp.y,
                                'y': fit_gp.y,
                                'noise': fit_gp.noise},
                         shape=1)
y_ = pm.Normal('y', mu=gp_cond, sd=sigma, observed=y)
```

This works reasonably well: it approximates the function that generated the original X and y data, it is much faster to evaluate, and I can use gradient-based samplers.

I would like to make sure, though, that when I am using the latent Gaussian process I stay within the domain where the fit has relatively high confidence. Using the above figure as an example, I want to make sure that I wouldn't end up sampling in the region above 10.

This is especially a problem when my observed y is close to the mean function value of my GP: the samplers then tend to push my solution out into the unobserved domain.

Because my X is multivariate, I can't set bounds directly. I would like to weight my likelihood towards regions of low variance in my GP. Does anyone have any ideas?
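One way to express such a variance weighting, sketched in plain numpy (the squared-exponential kernel, lengthscale `ls`, and penalty strength `lam` are all illustrative assumptions, not the poster's actual setup; in a pymc3 model this term would enter via `pm.Potential`):

```python
import numpy as np

def expquad(A, B, ls=1.0):
    # squared-exponential kernel between two sets of row vectors
    d2 = ((A[:, None, :] - B[None, :, :]) / ls) ** 2
    return np.exp(-0.5 * d2.sum(axis=-1))

def predictive_variance(x_new, X, ls=1.0, noise=1e-6):
    # GP posterior variance at x_new, given calibration inputs X
    K = expquad(X, X, ls) + noise * np.eye(len(X))
    k = expquad(x_new[None, :], X, ls)
    return float(expquad(x_new[None, :], x_new[None, :], ls)
                 - k @ np.linalg.solve(K, k.T))

def variance_penalty(x_new, X, lam=10.0, ls=1.0):
    # extra log-likelihood term: strongly down-weights
    # regions where the GP is uncertain
    return -lam * predictive_variance(x_new, X, ls=ls)
```

Near the calibration points the penalty is close to zero; far from them it approaches `-lam` times the prior variance, so the sampler is pulled back toward the well-constrained region without a hard cutoff.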

If I'm understanding right, you are basically trying to put a prior on `gp` that has no support on the region greater than, e.g., 10. So maybe you could restrict the GP to equal zero in the out-of-bounds region? To do this, you can modify the covariance function to equal zero in that region. See here, in the section called "A custom changepoint covariance function".
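The zeroed-out covariance idea can be sketched in plain numpy (the boundary at 10, the sigmoid steepness, and the 1-D kernel are illustrative assumptions; in pymc3 itself, something like `pm.gp.cov.ScaledCov` with a sigmoid scaling function can play this role):

```python
import numpy as np

def expquad(X, Xs, ls=1.0):
    # squared-exponential kernel on 1-D inputs
    d = X[:, None, 0] - Xs[None, :, 0]
    return np.exp(-0.5 * (d / ls) ** 2)

def inside_region(x, boundary=10.0, steepness=4.0):
    # smooth indicator: ~1 below the boundary, ~0 above it
    return 1.0 / (1.0 + np.exp(-steepness * (boundary - x)))

def bounded_cov(X, Xs, ls=1.0):
    # scale the kernel by the indicator on both arguments, so the
    # prior variance collapses to ~0 outside the calibrated region
    s = inside_region(X[:, 0])[:, None] * inside_region(Xs[:, 0])[None, :]
    return s * expquad(X, Xs, ls)

X = np.array([[5.0], [9.0], [12.0]])
K = bounded_cov(X, X)
```

With near-zero prior variance beyond the boundary, the GP is pinned to its mean function there, which removes the posterior's incentive to wander out of bounds.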


@bwengals the only problem with that is that sometimes the value I'm converging to in my second model is zero or nearby, so I have a tendency to move outside of my calibrated domain and stay there. I'll dig through that notebook and see if I get any brainwaves, though.

I had two thoughts that I'm not totally sure how to implement:

1. I could constrain my sample points so that they are required to stay within a certain norm distance of a point used to construct the GP. I tried something like:

```python
# minimum Euclidean distance from the proposed point to any calibration input
point_dist = pm.Deterministic('_distance_',
                              tt.min([(params_ - xn).norm(2) for xn in X]))
parameter_bounds = pm.Normal('_distance_params_',
                             mu=point_dist,
                             sd=some_sd_of_X_distance,
                             observed=some_mean_distance)
```

but this didn't seem to fix the problem, which makes sense since the distribution isn't truly bounded. Does anyone see how I could use my computed distance point_dist and assign it non-zero probability only if it is less than some value?
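The hard cutoff asked about here can be written as a potential whose log-probability is `-inf` outside the bound; in pymc3 this would be something like `pm.Potential('in_domain', tt.switch(point_dist < dmax, 0., -np.inf))`. A minimal numpy sketch of the idea (function names, `dmax`, and `lam` are hypothetical), with a smooth variant that keeps gradient samplers usable:

```python
import numpy as np

def domain_logp(params, X, dmax):
    # hard bound: zero log-probability contribution inside the
    # bound, -inf outside, based on the nearest calibration point
    d = np.min(np.linalg.norm(X - params, axis=1))
    return 0.0 if d < dmax else -np.inf

def smooth_domain_logp(params, X, dmax, lam=100.0):
    # gradient-friendly alternative: quadratic penalty that only
    # activates once the distance exceeds dmax
    d = np.min(np.linalg.norm(X - params, axis=1))
    return -lam * max(0.0, d - dmax) ** 2
```

Note that a hard `-inf` boundary gives NUTS/HMC no gradient information, so proposals near the edge are simply rejected; the smooth penalty trades a strict bound for a usable gradient.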

2. Is there a way to implement a Gaussian process such that the gradient stays constant once it exits the region of support? Maybe I could do this with changepoints somehow?