Bayesian model calibration with Gaussian Process

I am far from an expert on PyMC3, but does it make sense to stack your x_i and \theta values into one large X matrix to pass into your GP? That way all of your x_i and \theta values are inputs.

Something like

import numpy as np
import pymc3 as pm

n = number_of_datapoints

# draw parameter samples for the training set
theta1_vals = np.random.normal(loc=theta1_mu, scale=theta1_sd, size=n)
theta2_vals = np.random.normal(loc=theta2_mu, scale=theta2_sd, size=n)
# ... repeat for any additional parameters ...

# list() so the zip can be iterated more than once in Python 3
thetas = list(zip(theta1_vals, theta2_vals))

# run the simulator at each parameter sample
y1 = np.array([some_function(x, theta1, theta2) for theta1, theta2 in thetas])

# each row is the fixed x inputs followed by one theta sample
X = np.vstack([np.hstack([x, theta]) for theta in thetas])

with pm.Model() as model:
    # ... priors, mean/covariance functions, gp1 = pm.gp.Marginal(...) ...

    f1 = gp1.marginal_likelihood('y', X=X, y=y1, noise=sigma_1)
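If it helps, here is a rough, untested sketch of how you might then query the fitted emulator at new x and \theta values with gp.Marginal's conditional method; x_new, new_thetas, and trace are placeholder names for whatever you actually have:

# hypothetical follow-up: predict at new (x, theta) inputs after sampling
X_new = np.vstack([np.hstack([x_new, theta]) for theta in new_thetas])

with model:
    # conditional() gives the GP posterior at the new inputs
    y_pred = gp1.conditional('y_pred', Xnew=X_new)
    # draw predictive samples using the fitted trace
    pred = pm.sample_posterior_predictive(trace, var_names=['y_pred'])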

Another thought: if you never need to vary x_i after you have created your training set, maybe your GP only needs to be a function of \theta, even though x_i was used in its creation. Could you just do:

import numpy as np
import pymc3 as pm

n = number_of_datapoints

theta1_vals = np.random.normal(loc=theta1_mu, scale=theta1_sd, size=n)
theta2_vals = np.random.normal(loc=theta2_mu, scale=theta2_sd, size=n)
# ... repeat for any additional parameters ...

thetas = list(zip(theta1_vals, theta2_vals))

y1 = np.array([some_function(x, theta1, theta2) for theta1, theta2 in thetas])

# the design matrix now contains only the theta samples
X = np.vstack(thetas)

with pm.Model() as model:
    # ... priors, mean/covariance functions, gp1 = pm.gp.Marginal(...) ...

    f1 = gp1.marginal_likelihood('y', X=X, y=y1, noise=sigma_1)

Your Gaussian process would then be defined over len(y1) training points, so you never need to pass x_i when you call it. The first approach would let you make x_i uncertain, though, which I could see being helpful sometimes.
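For the uncertain-x_i case, here is a very rough sketch of what I have in mind. It is untested: I'm assuming the covariance functions are happy with a symbolic X, and x_obs / x_err are placeholder names for your measured inputs and their uncertainty:

import theano.tensor as tt

with pm.Model() as model:
    # prior on the true x inputs, centred on the measured values
    x_latent = pm.Normal('x_latent', mu=x_obs, sd=x_err, shape=len(x_obs))

    # rebuild the stacked design matrix symbolically, one row per theta sample
    rows = [tt.concatenate([x_latent, tt.as_tensor_variable(np.asarray(theta))])
            for theta in thetas]
    X_t = tt.stack(rows)

    # ... priors, gp1 = pm.gp.Marginal(...) as before ...

    f1 = gp1.marginal_likelihood('y', X=X_t, y=y1, noise=sigma_1)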
