Kennedy-O'Hagan implementation, Bayesian model calibration

Hi there,

I’m quite new to probabilistic programming and pymc3, but what I’ve seen so far looks very cool. Thank you for providing this great toolbox!

Currently, I want to implement the Kennedy-O’Hagan framework in pymc3.
As far as I can see, some people have done the same thing before:
https://discourse.pymc.io/t/bayesian-model-calibration-with-gaussian-process/948
https://discourse.pymc.io/t/problem-with-multiprocessing-in-pymc3/1440

The setup is as follows (copied from kejzlarv’s post):

I have n observations z_i of the form

z_i = f(x_i, \theta) + g(x_i) + e_i,

where x_i are known inputs, \theta are unknown calibration parameters, and e_i are iid error terms. We also have m model evaluations y_j of the form

y_j = f(x^*_j, \theta_j),

where both x^*_j (different from the x_i above) and \theta_j are known. The data therefore consist of all z_i and y_j. In the paper, Kennedy and O’Hagan model f and g using Gaussian processes:

f \sim GP\{m_1(.,.), \Sigma_1[(.,.),(.,.)] \}

g \sim GP\{m_2(.), \Sigma_2[(.),(.)] \}.

Among other things, the goal is to get posterior samples for the unknown calibration parameters \theta.

I tried to implement the framework; in the beginning it seemed easy, but now I am confused. It would be great if you could give me some advice! The tricky part for me is the formulation of the marginal_likelihood calls…

Here is what I did so far:

    import pymc3 as pm
    import numpy as np
    import matplotlib.pyplot as plt
    from multiprocessing import freeze_support
    import sys
    import theano
    import theano.tensor as tt
    from mpl_toolkits.mplot3d import Axes3D

    def physical_system(x):
        # "true" physical process underlying the observations
        return 0.65 * x / (1 + x / 5)

    def observation(x):
        # noisy observations z_i of the physical system
        return physical_system(x) + np.random.normal(0, 0.01, len(x))

    def computational_system(input):
        # computer model f(x, t); the second column holds the calibration input t
        return input[:, 0] * input[:, 1]

    if __name__ == "__main__":
        freeze_support()
        # observations with noise
        x_obs = np.linspace(0,4,10)
        y_real = physical_system(x_obs[:])
        y_obs = observation(x_obs[:])

        # computation model
        N = 20
        x_comp = np.random.uniform(0,4,N)[:,None]
        t_comp = np.random.uniform(0,1,N)[:,None]
        input_comp = np.hstack((x_comp,t_comp))
        y_comp = computational_system(input_comp)


        x_obs_shared = theano.shared(x_obs[:, None])

        with pm.Model() as model:
            theta = pm.Normal('theta', mu=0, sd=10)
            noise = pm.HalfCauchy('noise',beta=5)
            ls_1 = pm.Gamma('ls_1', alpha=1, beta=1, shape=2)
            cov = pm.gp.cov.ExpQuad(2,ls=ls_1)
            gp1 = pm.gp.Marginal(cov_func=cov)   # emulator for the computer model
            gp2 = pm.gp.Marginal(cov_func=cov)   # model discrepancy
            gp = gp1 + gp2
            # observation inputs augmented with the calibration parameter theta
            input_1 = tt.concatenate([x_obs_shared, tt.tile(theta, len(x_obs), ndim=2).T], axis=1)
            f_0 = gp1.marginal_likelihood('f_0', X=input_comp, y=y_comp, noise=noise)
            f_1 = gp1.marginal_likelihood('f_1', X=input_1, y=y_obs, noise=noise)
            f = gp.marginal_likelihood('f', X=input_1, y=y_obs, noise=noise)

            trace = pm.sample(1000,chains=1,step=pm.Metropolis())
        pm.traceplot(trace)
        plt.show()

First two things to try: I would definitely recommend using the default NUTS sampler instead of Metropolis, and setting cores=1 in pm.sample whenever you use the GP submodule.
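
For reference, a minimal sketch of that sampling call, using the model context from the code above (the tune value is just an assumption):

    with model:
        # default NUTS (no step argument); cores=1 when using the GP submodule
        trace = pm.sample(1000, tune=1000, chains=1, cores=1)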

Also, the model @kejzlarv uses looks like:

gp1 = pm.gp.Marginal(mean_func = mean_fcn, cov_func = cov_fcn)
gp2 = pm.gp.Marginal(mean_func = mean_fcn, cov_func = cov_fcn)

y_1 = gp1.marginal_likelihood("y1", X=input_1, y=y1, noise=σ)
y_2 = gp2.marginal_likelihood("y2", X=input_2, y=y2, noise=σ) 

which is a little different from what you have. Maybe these things help?
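
To make the mapping concrete, here is a rough sketch of that two-GP / two-marginal-likelihood structure written with the variable names from the original post. This is only my reading of the snippet above, not checked against the linked thread; the priors and the two ExpQuad covariances are assumptions:

    with pm.Model() as model:
        theta = pm.Normal('theta', mu=0, sd=10)
        noise = pm.HalfCauchy('noise', beta=5)
        ls_1 = pm.Gamma('ls_1', alpha=1, beta=1, shape=2)
        ls_2 = pm.Gamma('ls_2', alpha=1, beta=1, shape=2)
        gp1 = pm.gp.Marginal(cov_func=pm.gp.cov.ExpQuad(2, ls=ls_1))  # simulator runs
        gp2 = pm.gp.Marginal(cov_func=pm.gp.cov.ExpQuad(2, ls=ls_2))  # field observations
        # field inputs augmented with the calibration parameter theta
        input_1 = tt.concatenate([x_obs_shared, tt.tile(theta, len(x_obs), ndim=2).T], axis=1)
        y_1 = gp1.marginal_likelihood('y_1', X=input_comp, y=y_comp, noise=noise)
        y_2 = gp2.marginal_likelihood('y_2', X=input_1, y=y_obs, noise=noise)
        trace = pm.sample(1000, chains=1, cores=1)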


@bwengals thanks for your reply!
I want to recreate the example described in this paper: https://www.researchgate.net/publication/278156450_Learning_about_physical_parameters_The_importance_of_model_discrepancy
What I did so far seems to work, but I am not 100% sure if it's correct.
First I build a surrogate model of the simulation output with a GP. Then I fit this GP to my observation data and quantify the model discrepancy simultaneously with a second GP.
Here is my code:

import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
from multiprocessing import freeze_support
import sys
import theano
import theano.tensor as tt
from mpl_toolkits.mplot3d import Axes3D
import pyDOE
from scipy.stats.distributions import uniform
import pickle
import os
 
def physical_system(x):
   return 0.65 * x / (1 + x / 20)

def observation(x):
   return physical_system(x[:]) + np.random.normal(0,0.01,len(x))

def computational_system(input):
   return input[:,0]*input[:,1]

if __name__ == "__main__":
  freeze_support()

  # observations with noise
  x_obs = np.linspace(0.0,4,100)
  y_real = physical_system(x_obs[:])
  y_obs = observation(x_obs[:])
  t_obs = np.full((100, 1), 0.5)
  inp_obs_new = np.hstack((x_obs[:, None], t_obs))

  # computation model
  N = 50
  design = pyDOE.lhs(2, samples=N, criterion='center')
  left = [-0.2,-0.2]; right = [4.2,1.2]
  for i in range(2):
     design[:,i] = uniform(loc=left[i],scale=right[i]-left[i]).ppf(design[:,i])
  x_comp = design[:,0][:,None]; t_comp = design[:,1][:,None]
  input_comp = np.hstack((x_comp,t_comp))
  y_comp = computational_system(input_comp)

  with pm.Model() as surrogate_model:
    ls_1 = pm.Gamma('ls_1', alpha=1, beta=1, shape=2)
    cov = pm.gp.cov.ExpQuad(2, ls=ls_1)
    gp = pm.gp.Latent(cov_func=cov)
    f_1 = gp.prior('f_1', X=input_comp)
    noise = pm.HalfCauchy('noise', beta=5)   # observation noise of the surrogate fit
    y = pm.Normal('y', mu=f_1, sd=noise, observed=y_comp)
    trace = pm.sample(1000, chains=1, init="adapt_diag", progressbar=True)

  # extract posterior means of the surrogate hyperparameters
  summary = pm.summary(trace)
  l_means = np.array([summary['mean'][key] for key in summary['mean'].keys() if 'ls_1' in key]).reshape(1, 2)
  noise_means = [summary['mean'][key] for key in summary['mean'].keys() if 'noise' in key]

  x_obs_shared = theano.shared(x_obs[:, None])
  with pm.Model() as model:
    ls_ = l_means
    cov = pm.gp.cov.ExpQuad(2, ls=ls_)
    gp1 = pm.gp.Latent(cov_func=cov)
    theta = pm.Normal('theta', mu=0.4, sd=0.3)
    # observation inputs augmented with the calibration parameter theta
    input_1 = tt.concatenate([x_obs_shared, tt.tile(theta, len(x_obs[:, None]), ndim=2).T], axis=1)
    # condition the surrogate GP on the computer-model runs
    givens = gp1._get_given_vals({'X': input_comp, 'f': y_comp})
    mu = pm.Deterministic('mu', gp1._build_conditional(input_1, *givens)[0])
    # discrepancy GP over the observation inputs only
    ls_2 = pm.Gamma('ls_2', alpha=1, beta=1, shape=1)
    cov_2 = pm.gp.cov.ExpQuad(1, ls_2)
    gp2 = pm.gp.Latent(cov_func=cov_2)
    delta = gp2.prior('delta', X=x_obs[:, None])
    y_ = pm.Normal('y_', mu=mu + delta, observed=y_obs)
    trace_ = pm.sample(1000, cores=4, chains=1, tune=1000, target_accept=0.95)

Now I want to implement constraints for my discrepancy Gaussian process “gp2” like in the paper. I want to define constraints like delta(0) = 0 or delta’(0) = 0 (first derivative), but I can’t find a proper formulation and haven’t found anything useful in this forum… Could somebody help?
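
One possible sketch for encoding the value constraint delta(0) = 0 is to multiply the discrepancy covariance by a linear kernel centred at zero, so that prior draws of delta vanish at x = 0. This would replace the cov_2 / gp2 part of the model above; the kernel choice is only an assumption, and it does not handle the derivative constraint delta'(0) = 0:

    # k(x, x') = x * x' * k_SE(x, x') is zero whenever x = 0 or x' = 0,
    # so with a zero mean function the GP prior satisfies delta(0) = 0
    ls_2 = pm.Gamma('ls_2', alpha=1, beta=1)
    cov_2 = pm.gp.cov.Linear(1, c=0.0) * pm.gp.cov.ExpQuad(1, ls=ls_2)
    gp2 = pm.gp.Latent(cov_func=cov_2)
    delta = gp2.prior('delta', X=x_obs[:, None])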

This type of problem (Kennedy-O’Hagan) would be a really useful example if the development team had the free energy to clean it up and put it in the Examples gallery.
