Combining GP and PyMC3

Hello, I am just learning Python, Gaussian processes, and Bayesian calibration, so here comes a pretty basic question.

I created some simulation values, sim_results, which represent the outputs of a model simulation.
I then built a simple "meta-model" with Gaussian process regression, so that I can query values from it during the Bayesian calibration. When specifying the likelihood in PyMC3, how should I determine mu so that it refers to the Gaussian process regressor?

likelihood = pm.Normal("y", mu=mean, sigma=std, observed=y_obs)

What I do not understand is basically how you connect these two parts: the regression model and the Bayesian calibration.


CODE:

import numpy as np

#Generating x-values
x_values = np.linspace(start=1, stop=3, num=10).reshape(-1,1)

#SIMULATION RESULTS____________________________________________________________
#Generating simulation results
calculation = np.squeeze(660 + 1398*x_values)
rng = np.random.RandomState(1)
noise_std = 100
sim_results = calculation + rng.normal(loc=0.0, scale=noise_std, size=calculation.shape)
print("Simulation results: ", sim_results)


#META-MODEL____________________________________________________________________
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

noise_std = 100

kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
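#alpha is the noise variance added to the kernel diagonal during fitting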
gaussian_process = GaussianProcessRegressor(kernel=kernel, alpha=noise_std**2, n_restarts_optimizer=9)
gaussian_process.fit(x_values, sim_results)
mean_prediction, std_prediction = gaussian_process.predict(x_values, return_std=True)

#Get sim_results for new x-values
X_test = np.array([1.15,2.15,2.89])
x = X_test.reshape(-1,1)
print("Test: ", x)
m, s = gaussian_process.predict(x, return_std=True)
print("Mean: ", m, "std: ",s)


#BAYESIAN CALIBRATION__________________________________________________________
import seaborn as sns
import matplotlib.pyplot as plt
import pymc3 as pm

y_obs = [2861, 2012, 1990, 2250, 1873, 1920, 1988, 2136]

with pm.Model() as model:
    #Priors
    prior_x = pm.Normal("prior_x", mu=1.5, sigma=0.5)
    #Likelihood
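    #NOTE: this is the line that does not work: prior_x is a PyMC3 random
    #variable (a Theano tensor), so sklearn's predict() cannot evaluate it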
    mean, std = gaussian_process.predict(prior_x, return_std=True)
    likelihood = pm.Normal("y", mu=mean, sigma=std, observed=y_obs)
    #Posterior
    trace = pm.sample(1000, cores=4)
        
pm.traceplot(trace, figsize=(7, 7))
plt.tight_layout()

for var in ['prior_x']:
    plt.figure(figsize=(10,4))
    vals = trace.get_values(var)
    mean, lower, upper = round(vals.mean(),2), round(vals.mean()-vals.std(),2), round(vals.mean()+vals.std(),2)
    sns.distplot(vals)
    posterior_est = plt.axvline(mean, color='b')
    plt.axvline(lower, color='r', linestyle='--')
    plt.axvline(upper, color='r', linestyle='--')

    plt.show()

Hello,

I’m also very new here, but I can direct you to my post: Bayesian calibration on GP model.

I hope it yields some useful information.

In short, you may want to build your GP within PyMC, using marginal_likelihood to specify the x and y values. I don’t know whether a fitted sklearn GP can be incorporated as-is into the PyMC ecosystem (notably for gradient computation), but you could perhaps fall back on black-box likelihoods (Using a “black box” likelihood function — PyMC3 3.11.5 documentation).
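For instance, here is a minimal sketch of the first route (I have not run it against your data): the emulator is rebuilt as a PyMC3 GP on the simulator runs, so the calibration parameter and the meta-model live in one model. The priors on ls, eta and sigma are placeholder choices of mine:

import pymc3 as pm

with pm.Model() as gp_model:
    #Hyperpriors for the squared-exponential (ExpQuad) kernel -- placeholder choices
    ls = pm.Gamma("ls", alpha=2.0, beta=1.0)
    eta = pm.HalfNormal("eta", sigma=1000.0)
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls)
    gp = pm.gp.Marginal(cov_func=cov)
    #Noise of the simulator runs
    sigma = pm.HalfNormal("sigma", sigma=200.0)
    #Condition the GP directly on the simulation data (x_values, sim_results)
    y_sim = gp.marginal_likelihood("y_sim", X=x_values, y=sim_results, noise=sigma)
    trace_gp = pm.sample(1000)

And a sketch of the black-box route, wrapping your already-fitted sklearn regressor in a Theano Op so PyMC3 can call it. GPMeanOp is a name I made up; since the Op defines no gradient, you have to use a gradient-free step method such as Metropolis instead of NUTS:

import numpy as np
import theano.tensor as tt
import pymc3 as pm

class GPMeanOp(tt.Op):
    #Evaluates the sklearn GP mean at a scalar input; no gradient defined
    itypes = [tt.dscalar]   #input: the calibration parameter x
    otypes = [tt.dscalar]   #output: the GP mean prediction at x

    def __init__(self, gp):
        self.gp = gp

    def perform(self, node, inputs, outputs):
        (x,) = inputs
        mean = self.gp.predict(np.array([[x]]))
        outputs[0][0] = np.asarray(mean[0])

gp_mean = GPMeanOp(gaussian_process)

with pm.Model() as model:
    prior_x = pm.Normal("prior_x", mu=1.5, sigma=0.5)
    mu = gp_mean(prior_x)
    #Fixed observation noise here for simplicity
    likelihood = pm.Normal("y", mu=mu, sigma=100.0, observed=y_obs)
    trace = pm.sample(2000, step=pm.Metropolis(), cores=1)

Both are only sketches, but hopefully they show where the connection between the regression model and the calibration happens: mu has to be something PyMC3 can evaluate symbolically, not the return value of a plain Python call on a random variable.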
