Bayesian calibration on GP model

I just started discovering PyMC a few days ago, so please forgive me if my question is really basic; I couldn't find or understand the answer yet.

My objective is very straightforward: I have X and y values from a given code, on which I can build a GP model. I usually use GPy and get maximum-likelihood hyperparameters, but why not remain fully Bayesian in PyMC. Once this GP is built (hyperparameters fixed and conditioned), I wish to solve an inverse problem in a Bayesian manner: I have an experimental value y_exp, and would like to get a posterior distribution of x_exp. In short, given my current GP model's accuracy, how certain am I about the calibrated x_exp?
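To make the objective concrete, here is a toy grid-based sketch of the kind of inverse problem I mean (plain NumPy, no GP; the forward model and the noise level are just illustrative assumptions):

```python
import numpy as np

def f(x):
    """Illustrative forward model standing in for the expensive code."""
    return np.sin(10 * x)

y_exp = f(0.8)          # hypothetical experimental observation
sigma_obs = 0.05        # assumed observation noise

# Uniform prior on [0, 1], evaluated on a grid
x_grid = np.linspace(0, 1, 1001)

# Gaussian log-likelihood of each candidate x given y_exp
log_like = -0.5 * ((f(x_grid) - y_exp) / sigma_obs) ** 2

# Normalised posterior on the grid
post = np.exp(log_like - log_like.max())
post /= post.sum()

x_map = x_grid[np.argmax(post)]
print(x_map)
```

Note that sin(10x) makes the problem multimodal on [0, 1], so the full posterior (rather than just its maximum) is what captures the calibration uncertainty I am after.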

This is where I struggle. Can this be done seamlessly in PyMC3, with an "observed" argument somewhere (in gp.conditional or predict)? Can this be solved with gradient-based samplers? Or should I fall back to a black-box likelihood that I compute myself?

Thanks a lot in advance

From your description I think it should be possible. This thread may be helpful to get you started: Bayesian model calibration with Gaussian Process - #18 by kejzlarv

1 Like

Thank you for the link. I had already seen this post but couldn't quite figure out what was missing in my understanding. After going over it again, I think what I was looking for was the gp.marginal_likelihood function, which "tells" the model that some values were observed. Passing a free variable in the X argument should then allow recovering it, which is my objective.

However, I now struggle with shapes and Theano conversion. In my toy example, I have a 1D GP, with a scalar observed value and a scalar parameter theta to recover. I can't figure out how to properly pass theta to the gp.marginal_likelihood function: either the number of dimensions is too low, or Theano refuses object-typed variables.

Here is the example:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

def f(x):
    return np.sin(10*x)

theta_train = np.linspace(0,1,7)
theta_true = 0.8

f_train = f(theta_train)
f_true = f(theta_true)

with pm.Model() as model:
    
    theta = pm.Uniform("theta", lower = 0, upper = 1)
    sigma = pm.HalfNormal("sigma", sd = 1)
    l_scale = pm.Gamma("l_scale", alpha= 1, beta = 5, shape = 2)
    s_n = pm.HalfNormal("s_n", sd = 1)
    
    cov = pm.gp.cov.Matern52(1, l_scale)
    
    gp = pm.gp.Marginal(cov_func = cov)
    
    theta_all = np.concatenate((theta_train,[theta]))
    f_all = np.concatenate((f_train,[f_true]))
    
    y1 = gp.marginal_likelihood("y1",
                                X=theta_all.reshape(-1,1),
                                y=f_all.reshape(-1,1),
                                noise=s_n)

Again, I'm sorry if the answer relies only on basic Theano shape handling.
Thank you for your insights!

Hi, if you are new to PyMC, it may be better to use PyMC v4 with Aesara. The new PyMC version handles shapes better, and there are a lot of improvements in Aesara compared to Theano.

For the shape handling in PyMC3, you can check this notebook: gp_regression/spawning_salmon.ipynb at master · fonnesbeck/gp_regression · GitHub

I converted the above notebook into PyMC v4 here: gp_experiments/gp_salmon_pymc_v4.ipynb at main · danhphan/gp_experiments · GitHub

For your example, in these lines of code:

    y1 = gp.marginal_likelihood("y1",
                                X=theta_all.reshape(-1,1),
                                y=f_all.reshape(-1,1),
                                noise=s_n)

Maybe X should have shape (n, 1), while y should have shape (n,), not (n, 1).
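For example, in plain NumPy (independent of PyMC): X is a matrix of inputs with one row per point, while y is a flat vector of outputs.

```python
import numpy as np

theta_all = np.linspace(0, 1, 8)   # 8 scalar input points
f_all = np.sin(10 * theta_all)     # corresponding outputs

X = theta_all.reshape((-1, 1))     # column matrix of inputs: shape (n, 1)
y = f_all.reshape((-1,))           # flat vector of outputs: shape (n,)

print(X.shape, y.shape)            # (8, 1) (8,)
```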

2 Likes

Thank you for your answer, I will gladly have a look at these notebooks. I will try PyMC v4 when possible, to compare with PyMC3.

Yet in my case, I mostly face this error:

TypeError: Unsupported dtype for TensorType: object

which, as I understand it, is caused by the fact that I put a PyMC3 free variable in the X argument.

Unfortunately, I haven't been able to solve my issue yet. Has anyone already passed PyMC variables to X (concatenated with known values) in the marginal_likelihood method in PyMC3?
Many thanks!

Hi there MickaRiv,

I was able to get it to run with a few minor tweaks to your code. I only have PyMC v4 installed, so I switched to Aesara, and there were a few small syntax differences. Apart from that, I fiddled with the reshapes a bit and used pm.math.concatenate for the theta concatenation; I think the latter was the key change.

import numpy as np
import pymc as pm
import aesara.tensor as at

def f(x):
    return np.sin(10 * x)

theta_train = np.linspace(0, 1, 7)
theta_true = 0.8

f_train = f(theta_train)
f_true = f(theta_true)

with pm.Model() as model:

    theta = pm.Uniform("theta", lower=0, upper=1)
    sigma = pm.HalfNormal("sigma", 1)
    l_scale = pm.Gamma("l_scale", alpha=1, beta=5, shape=2)
    s_n = pm.HalfNormal("s_n", 1)

    cov = pm.gp.cov.Matern52(1, l_scale)

    gp = pm.gp.Marginal(cov_func=cov)

    # pm.math.concatenate keeps the free variable theta symbolic;
    # np.concatenate would coerce it into an object-dtype array
    theta_all = pm.math.concatenate((theta_train, [theta]))
    f_all = np.concatenate((f_train, [f_true]))

    y1 = gp.marginal_likelihood("y1",
                                X=theta_all.reshape((-1, 1)),  # inputs: (n, 1)
                                y=f_all.reshape((-1,)),        # outputs: (n,)
                                noise=s_n)

Hope this works for you and is helpful!

3 Likes

Hi Martin_Ingram,
Thanks a lot for your answer. It took me some time to test this on my side, but it indeed works like a charm.
pm.math.concatenate does seem to be the key here.
It also works properly with PyMC3.

Thanks again !

2 Likes