Hi all!
I’m trying to use a Gaussian Process as the input for the covariance matrix of an MvNormal likelihood:
```python
with pm.Model() as m:
    etasq = 1. + pm.HalfNormal('etasq', 0.25)
    rhosq = 3. + pm.HalfNormal('rhosq', 0.25)
    cov = etasq * pm.gp.cov.ExpQuad(input_dim=1, ls=rhosq)
    gp = pm.gp.Latent(cov_func=cov)
    SIGMA = gp.prior('SIGMA', X=Dist_mat ** 2)

    a = pm.Normal("a", 0., 1.)
    b = pm.Normal("b", 0., 0.5, shape=2)
    mu = a + b[0] * M + b[1] * G

    B = pm.MvNormal("B", mu, cov=SIGMA, observed=B_)
    trace = pm.sample()
```
But running this raises `ValueError: cov must be two dimensional`, because, indeed, `SIGMA.tag.test_value.shape` returns `(151,)`.
I'm not very familiar with the GP module yet, so I'm not sure what to make of this. Does anyone see a way forward?
Thanks a lot in advance!
PS: I wanted to keep the post as short as possible, but I can share the data if needed.
I think it is working as expected. Here's an explanation.

`gp.prior` places an N-dimensional re-parameterized multivariate normal prior, where N is the number of samples (not features) in your `Dist_mat` array. This means `SIGMA` in your model is an N-dimensional sample from a multivariate normal, not a covariance matrix. Here, N is 151.

I think you want to fit a model on `Dist_mat` to get a distribution over the `B_` observed values. You can do that in the following way:
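To make the point above concrete, here is a hedged NumPy-only sketch (not PyMC3 internals, and with made-up inputs): evaluating a latent GP prior at N points produces one N-dimensional draw from a multivariate normal whose covariance is the kernel matrix, which is why `SIGMA` comes out with shape `(151,)` rather than `(151, 151)`:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 151
X = np.linspace(0., 1., N)[:, None]  # hypothetical 1-D inputs

# Squared-exponential kernel, like ExpQuad: eta^2 * exp(-d^2 / (2 * rho^2))
etasq, rhosq = 1.0, 3.0
sqdist = (X - X.T) ** 2
K = etasq * np.exp(-0.5 * sqdist / rhosq ** 2)
K += 1e-9 * np.eye(N)  # small jitter for numerical stability

# One draw from the GP prior at these N points: a vector, not a matrix
f = rng.multivariate_normal(np.zeros(N), K)
print(f.shape)  # (151,)
```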
```python
class MyMean(pm.gp.Mean):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __call__(self, X):
        return self.a + self.b[0] * M + self.b[1] * G


with pm.Model() as m:
    etasq = 1. + pm.HalfNormal('etasq', 0.25)
    rhosq = 3. + pm.HalfNormal('rhosq', 0.25)
    a = pm.Normal("a", 0., 1.)
    b = pm.Normal("b", 0., 0.5, shape=2)
    cov = etasq * pm.gp.cov.ExpQuad(input_dim=1, ls=rhosq)
    mu = MyMean(a, b)
    gp = pm.gp.Marginal(mean_func=mu, cov_func=cov)
    B = gp.prior('B', X=Dist_mat ** 2, y=B_)
    trace = pm.sample(1000)
```
I don't know what `M` and `G` are in your model, but you probably want to make the mean function a function of your inputs `X`, which is `Dist_mat` in your model.
Ooh, thanks a lot! I forgot that the realization of a GP is an MvNormal. I'm gonna try that and keep you posted.
Do you know if there is a way to specify the mean function without the custom class though, as I did for the cov func? This is for the port of chapter 14 of Rethinking 2, so I'd like the code to be as simple as possible.
`M`, `G` and `B` are respectively body mass, group size and brain size, and in the book only the covariance function is a function of the phylogenetic distance (`Dist_mat`) between species.
You can just make a lambda function to represent your mean in a compact way:

```python
lambda X: a + b[0] * M + b[1] * G
```
The only downside is that you would lose the semantics to combine mean functions to form a new one. As that is not your concern for now, I think a lambda function should work just fine.
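To illustrate with a toy example (made-up numbers, nothing from the thread): the mean function is simply called with `X`, so any callable with that signature works, including a lambda that closes over fixed predictors and ignores `X`:

```python
import numpy as np

# Hypothetical predictors, fixed at model-definition time like M and G above
M = np.array([0.1, 0.2, 0.3])
G = np.array([1.0, 1.0, 2.0])
a, b = 0.5, np.array([1.0, -2.0])

# The mean function receives X but, like the lambda above, does not use it
mean_func = lambda X: a + b[0] * M + b[1] * G

mu = mean_func(np.zeros((3, 1)))
print(mu)  # [-1.4 -1.3 -3.2]
```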
Actually I went ahead with the custom class: it's more scalable, shows an example of defining custom mean functions for GPs, and is quite classy (see what I did there?).
As this is an MvNormal likelihood, I could use `pm.gp.Marginal`, which goes with `gp.marginal_likelihood`, not with `gp.prior`, IIUC. The full model is now:
```python
import theano.tensor as tt


class LinearMean(pm.gp.mean.Mean):
    def __init__(self, intercept, slopes):
        self.intercept = intercept
        self.slopes = slopes

    def __call__(self, X):
        return self.intercept + self.slopes[0] * M + self.slopes[1] * G


with pm.Model() as m14_11:
    a = pm.Normal("a", 0., 1.)
    b = pm.Normal("b", 0., 0.5, shape=2)
    mu = LinearMean(intercept=a, slopes=b)

    eta_raw = pm.Normal('eta_raw', 1., 0.25)
    rho_raw = pm.Normal('rho_raw', 3., 0.25)
    etasq = tt.abs_(eta_raw)
    rhosq = tt.abs_(rho_raw)
    SIGMA = etasq * pm.gp.cov.ExpQuad(input_dim=1, ls=rhosq)
    gp = pm.gp.Marginal(mean_func=mu, cov_func=SIGMA)

    sigma = pm.Exponential("sigma", 1.)
    B = gp.marginal_likelihood("B", X=Dist_mat ** 2, y=B_, noise=sigma)
    trace_14_11 = pm.sample(2000, tune=2000, target_accept=0.9)
```
If I wanted to make the mean function a function of `Dist_mat`, as `SIGMA` is, I think I could directly use `pm.gp.mean.Linear`.
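For future readers, here is a NumPy sketch of what I understand `pm.gp.mean.Linear` computes, assuming its PyMC3 signature `Linear(coeffs, intercept=0)` and made-up inputs: the mean at `X` is just `intercept + X @ coeffs`:

```python
import numpy as np

# Hypothetical inputs and parameters; mimics pm.gp.mean.Linear(coeffs=b, intercept=a)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
coeffs = np.array([0.5, -1.0])
intercept = 2.0

mu = intercept + X @ coeffs  # one mean value per row of X
print(mu)  # [ 0.5 -0.5]
```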
Thanks again for your precious help @tirthasheshpatel!