I have a model that appears to work in which I’m modelling my response variable as a GP.
Here is its spec:
y = df_daily_small_impute['amine_degradation']/df_daily_small_impute['total_hrs'].values
X = df_daily_small_impute.index.values.reshape(-1,1)
with pm.Model() as model_gp_amine:
y_obs = pm.MutableData('y_obs', y)
# params
sigma = pm.Exponential('sigma', 1)
ℓ = pm.Gamma("ℓ", alpha=2, beta=2)
η = pm.HalfCauchy("η", beta=3)
# Define the mean and covariance function
mean_func = pm.gp.mean.Zero()
cov_func = η**2 * pm.gp.cov.ExpQuad(1, ℓ)
# Define the Gaussian Process with the mean function
gp = pm.gp.Marginal(mean_func=mean_func, cov_func=cov_func)
# Define the observed var
obs = gp.marginal_likelihood("y", X=X, y=y_obs, noise=sigma)
idata_gp_amine = pm.sample()
From here, I’m able to call:
X_new = np.arange(0,180).reshape(-1,1)
with model_gp_amine:
y_pred = gp.conditional("y_pred", X_new)
with model_gp_amine:
pred_samples = pm.sample_posterior_predictive(idata_gp_amine, var_names=['y_pred'])
and I’m able to build pred_samples with no issues.
However, I’m now trying to include another GP of a covariate (no2) so that I might impute some missing values within that covariate. Here is the spec for the new model:
x_no2 = df_daily_small_impute['Nitrogen dioxide NO2_mean'].values
y = (df_daily_small_impute['amine_degradation']/df_daily_small_impute['total_hrs']).values
X = df_daily_small_impute.index.values.reshape(-1,1)
# mean function
class MeanDegradation(pm.gp.mean.Linear):
def __init__(self,
b_no2,
no2,
):
self.b_no2 = b_no2
self.no2 = no2
def __call__(self, X):
return self.b_no2*no2
with pm.Model() as model_gp_amine_no2:
y_obs = pm.MutableData('y_obs', y)
# no2 priors
sigma_no2 = pm.Exponential('sigma_no2', 1)
ℓ_no2 = pm.Gamma("ℓ_no2", alpha=2, beta=2)
η_no2 = pm.HalfCauchy("η_no2", beta=3)
cov_func_no2 = η_no2**2 * pm.gp.cov.ExpQuad(1, ℓ_no2)
# no2 model
mean_func_no2 = pm.gp.mean.Zero()
gp_no2 = pm.gp.Marginal(mean_func=mean_func_no2, cov_func=cov_func_no2)
no2 = gp_no2.marginal_likelihood('no2', X=X, y=x_no2, noise=sigma_no2)
# amine priors
sigma_amine = pm.Exponential('sigma', 1)
b_no2 = pm.Normal('b_no2', 0, 1)
ℓ_amine = pm.Gamma("ℓ_amine", alpha=2, beta=2)
η_amine = pm.HalfCauchy("η_amine", beta=3)
cov_func_amine = η_amine**2 * pm.gp.cov.ExpQuad(1, ℓ_amine)
# amine model
mean_func_amine = MeanDegradation(b_no2=b_no2, no2=no2)
gp_amine = pm.gp.Marginal(mean_func=mean_func_amine, cov_func=cov_func_amine)
amine = gp_amine.marginal_likelihood("y", X=X, y=y_obs, noise=sigma_amine)
idata_gp_amine_no2 = pm.sample(50)
My intuition is that I should be defining the conditional distributions for both my y(amine) variable, and my no2 variable using each of their gp objects respectively, so I’ve been calling:
X_new = np.arange(0,180).reshape(-1,1)
with model_gp_amine_no2:
y_pred = gp_amine.conditional('y_pred', X_new)
no2_pred = gp_no2.conditional('no2_pred', X_new)
This results in the error:
ValueError: Could not broadcast dimensions. Incompatible shapes were [(TensorConstant(TensorType(int64, shape=()), data=array(45)),), (TensorConstant(TensorType(int64, shape=()), data=array(180)),)].
Predictably, get another error when I attempt to build pred_samples:
ValueError: Incompatible shapes for gemv (beta * y + alpha * dot(A, x)). y: (45,), A: (180, 45), x: (45,)
Apply node that caused the error: Gemv{inplace}(Composite{...}.0, 1.0, Transpose{axes=[1, 0]}.0, SolveTriangular{trans=0, unit_diagonal=False, lower=True, check_finite=True, b_ndim=1}.0, 1.0)
Toposort index: 71
Inputs types: [TensorType(float64, shape=(None,)), TensorType(float64, shape=()), TensorType(float64, shape=(180, 45)), TensorType(float64, shape=(None,)), TensorType(float64, shape=())]
Inputs shapes: [(45,), (), (180, 45), (45,), ()]
Inputs strides: [(8,), (), (360, 8), (8,), ()]
Inputs values: ['not shown', array(1.), 'not shown', 'not shown', array(1.)]
Outputs clients: [[SpecifyShape(Gemv{inplace}.0, 180)]]
A clue. If I swap out gp_amine and gp_no2 in my more complex model for the gp fitted in my old model, simply called “gp”, then I don’t have any issues and it works fine, despite producing horrible predictions.
I hope someone can help me determine what I’m doing wrong here because I’m at my wit’s end right now.
Thanks