Hello,
I’m working with my own biological model that is relatively slow to run, and has a relatively high number of input parameters (~8-15 depending on approach), as well as one input variable (calcium concentration), to predict one output variable (force). Ultimately, I’m looking to carry out some kind of uncertainty quantification as I fit my parameters to experimental measurements, but because my biological model is relatively slow, I am hoping to use a Gaussian Process surrogate model to help carry out this analysis.
I believe that I should be able to use a marginal GP, rather than a latent GP in this case because I can assume normal likelihood, and I’m not carrying out any additional components to the model.
I’ve create a simplified 2D surrogate model below that only has one input parameter (nH) and one input (pCa) which seems to be behaving as expected, but I wanted to check with experts if using a marginal GP in this surrogate fitting approach seems reasonable.
with pm.Model() as gp_model_2D:
# Priors for kernel hyperparameters
# Lengthscale: how quickly the function varies
ell = [pm.Gamma('ell0', alpha=2, beta=1),
pm.Gamma('ell1', alpha=2, beta=1)]
# Amplitude: typical scale of function values
eta = pm.HalfNormal('eta', sigma=3)
# Covariance function
cov_func = eta**2 * pm.gp.cov.ExpQuad(input_dim=2, ls=ell)
# mean function
mf = pm.gp.mean.Constant(0.5)
# GP prior with zero mean
gp = pm.gp.Marginal(cov_func=cov_func, mean_func=mf)
# Noise variance prior
sigma = pm.HalfNormal('sigma', sigma=0.01)
# Marginal likelihood
y_obs = gp.marginal_likelihood('y_obs', X=X_data, y=y_data, sigma=sigma)
# Sample from posterior
trace_gp = pm.sample(draws = 1000, tune=1000, random_seed=43, return_inferencedata=True, nuts_sampler='nutpie',
cores = 16)
Of course the next step will be to actually have another instance of the model where I am actually fitting data to the experimentally observed values and estimating parameter values like nH.
Thanks so much for any feedback or suggestions. I’m also comparing this approach to using a neural network surrogate model, and would be curious if anyone has suggestions between the two.