I am in the process of translating models I have written in stan into pymc. I am not sure what the most efficient way to implement the following model is:
functions {
vector biased_model(vector ach_v, real P, real K, int N){
vector[N] model_out;
model_out = ((ach_v * P) ./ (ach_v + K)) + (ach_v * 0.015);
return model_out;
}
}
// The input data is a vector 'y' of length 'N'.
data {
int<lower=0> N; // Number of observations
vector[N] y; // Observations of I/O ratio
real ach[N]; // ACH measurements
}
transformed data{
vector[N] ach_v = to_vector(ach);
}
parameters {
real<lower=0> sigma;
real<lower=0> P;
real<lower=0> K;
real<lower=0> rho;
real<lower=0> alpha;
}
// The model to be estimated. We model the output
// 'y' to be normally distributed.
// The mean of this distribution is the result of
// applying the steady state equation on the measured ACH values
// given the *sampled* parameters P and K.
// The standard deviation is the noise captuered by sigma.
model {
matrix[N, N] L_K;
matrix[N, N] K_cov = gp_exp_quad_cov(ach, alpha, rho);
real sq_sigma = square(sigma);
// diagonal elements
for (n in 1:N) {
K_cov[n, n] = K_cov[n, n] + sq_sigma;
}
L_K = cholesky_decompose(K_cov);
// priors
rho ~ inv_gamma(5, 5);
alpha ~ normal(0,2);
P ~ normal(0.85, 0.2);
K ~ normal(0.12, 0.05);
sigma ~ exponential(10);
y ~ multi_normal_cholesky(biased_model(ach_v, P, K, N), L_K);
}
What I implemented in the above is an example of Bayesian computer model calibration, as described in Higdon et al. (2004), where the computer model is assumed to be fast and biased. The purpose of the GP model is to capture the discrepancy (\delta) that exists between the true noisy observations (y) and computer model (\eta):
where x represents the known explanatory variables, \theta represents the uknown calibration parameters, and \epsilon is noise assumed to idd.
I did some reading on the pymc case studies and on the forum, and thought that a marginal likelihood may be the fastest, although, I’m not sure how to best implement it. I had a go as shown below, but please let me know if this is sensible.
import numpy as np
import pandas as pd
import pymc as pm
# Assume that the "true" but unknown values of P and K are
P_true = 0.8 #penetration efficiency
K_true = 0.10 #deposition rate (h-1)
# Create data
data = pd.DataFrame({"measured_ach": np.random.uniform(0.75, 1.25, 250)})
data["measured_i_o"] = data["measured_ach"].apply(lambda ach: model(ach, P=P_true, K=K_true))
sigma_par = 0.01
data["measured_i_o_noisy"] = data["measured_i_o"] + np.random.normal(loc=0, scale=sigma_par, size=len(data))
# Subset data for training
N = 49
y_train = data.loc[0:N,'measured_i_o_noisy'].to_numpy()[:, None]
ach_train = data.loc[0:N,"measured_ach"].to_numpy()[:, None]
class BiasedModel(pm.gp.mean.Mean):
def __init__(self, P, K, ach):
self.P = P
self.K = K
self.ach = ach
def __call__(self, ach):
return ((ach * P) / (ach + K) + (ach*0.015))
with pm.Model() as model_bias_marginal1:
# Assume exponential prior with rate (or inverse scale) of 10
sigma = pm.Exponential('sigma', lam=10)
# Assume normal prior with mean of 0.85, standard deviation of 0.2
P = pm.TruncatedNormal('P', mu=0.85, sigma=0.2, lower=0)
# Assume normal prior with mean of 0.12, standard deviation of 0.05
K = pm.TruncatedNormal('K', mu=0.12, sigma=0.05, lower=0)
# Note, in pymc doc rho is equivalent to l
rho = pm.InverseGamma("rho", alpha=5, beta=5)
eta = pm.HalfNormal("eta", sigma=2)
# Covariance function for the model bias GP
cov = eta**2 * pm.gp.cov.ExpQuad(1, rho)
# Specify biased model output as mean to the GP marginal
gp = pm.gp.Marginal(mean_func= BiasedModel(P, K, ach_train), cov_func=cov)
likelihood = gp.marginal_likelihood('y_obs', sigma=sigma, y=y_train, X=ach_train)
with model_bias_marginal1:
# draw 500 posterior samples
idata_model_bias_marginal1 = pm.sample(tune=500, draws=500, cores=2, chains=2, random_seed=456)
Another approach, which is likely slower, but seems more similar to my original model is:
def BiasedModel(ach, P, K):
return ((ach * P) / (ach + K)) + (ach*0.015)
# MvNormal likelihood implementation
with pm.Model() as model_bias_mvn:
###### Parameter Priors ##########
# Assume exponential prior with rate (or inverse scale) of 10
sigma = pm.Exponential('sigma', lam=10)
# Assume normal prior with mean of 0.85, standard deviation of 0.2
P = pm.TruncatedNormal('P', mu=0.85, sigma=0.2, lower=0)
# Assume normal prior with mean of 0.12, standard deviation of 0.05
K = pm.TruncatedNormal('K', mu=0.12, sigma=0.05, lower=0)
###### Hyperparameter Priors ######
# Note, in pymc doc rho is equivalent to l
rho = pm.InverseGamma("rho", alpha=5, beta=5)
eta = pm.HalfNormal("eta", sigma=2)
###### Model prediction #####
mu = pm.Deterministic("mu", BiasedModel(ach_train, P, K))
# Define covariance function for the model bias and noise
cov_delta = eta**2 * pm.gp.cov.ExpQuad(1, rho)
cov_epsilon = pm.gp.cov.WhiteNoise(sigma)
# Construct single covariance matrix
Sigma_comb = cov_delta(X=ach) + cov_epsilon(X=ach)
y_obs = pm.MvNormal('y_obs', mu=mu, cov=Sigma_comb, observed=y_train)
with model_bias_mvn:
# draw 250 posterior samples
idata_model_bias_mvn = pm.sample(tune=500, draws=250, cores=2, chains=2, random_seed=456)
My questions:
- Any advice on how to best define this model in pymc? Is there an alternative formulation that you would recommend?
- Am I missing something in either of my attempts that could speed things up, or do either of them seem wrong to you?
- In particular, with regard to my second attempt, can I take advantage of cholesky decomposition (or is that already implemented within the MvNormal function)?
I appreciate that I am asking a lot of questions, and providing me with an answer to any of them would be much appreciated!