Two gp.Marginals in a single pm.Model - which likelihood does pymc3 use for sampling?

Hello everyone,
I have the following “toy code” version of a more complex problem that I am currently implementing in pymc3, with two gp.Marginal objects defined inside a single pm.Model(). Are the gp.Marginals treated as independent? That is, what likelihood is used for sampling from the posterior in the following example:

import numpy as np
import pymc3 as pm

data_1 = np.random.rand(10)
data_2 = np.random.rand(5)
x_1 = np.arange(10)[:, None]
x_2 = np.arange(5)[:, None]

with pm.Model() as toy_model:
    theta = pm.Gamma("theta", alpha=10, beta=1)
    eta = pm.Gamma("eta", alpha=10, beta=1)
    sigma = pm.InverseGamma("sigma", alpha=10, beta=1)

    mean_f = pm.gp.mean.Zero()
    mean_delta = pm.gp.mean.Zero()

    cov_f = pm.gp.cov.ExpQuad(1, ls=theta)
    cov_delta = pm.gp.cov.ExpQuad(1, ls=eta)

    f = pm.gp.Marginal(mean_func=mean_f, cov_func=cov_f)
    delta = pm.gp.Marginal(mean_func=mean_delta, cov_func=cov_delta)
    observables = f + delta

    y_engine = f.marginal_likelihood("y_engine", X=x_1, y=data_1, noise=sigma)
    y_obs = observables.marginal_likelihood("y_obs", X=x_2, y=data_2, noise=sigma)
    trace = pm.sample(1000, chains=1)

Is the posterior being sampled from:
p(\theta, \eta, \sigma \mid data_1, data_2) \propto p(data_1, data_2 \mid \theta, \eta, \sigma)\, p(\theta, \eta, \sigma)

with

(data_1, data_2)^T \sim N\left(0, \begin{pmatrix} C_1(\theta) & C_{12}(\theta, \eta) \\ C_{21}(\theta, \eta) & C_2(\theta, \eta) \end{pmatrix}\right)

or

(data_1, data_2)^T \sim N\left(0, \begin{pmatrix} C_1(\theta) & 0 \\ 0 & C_2(\theta, \eta) \end{pmatrix}\right)

(I dropped the dependency on \sigma for simplicity.) I would be grateful for any answers, because I need to make sure that my code uses the first likelihood for posterior sampling.
Thanks!

It is the second case, as there is no correlation term between data_1 and data_2 in the model.
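Concretely, each marginal_likelihood call contributes its own multivariate normal factor (writing K_f and K_\delta for the two ExpQuad kernels), so the joint likelihood factorizes as

p(data_1, data_2 \mid \theta, \eta, \sigma) = N\big(data_1;\, 0,\, K_f(x_1, x_1) + \sigma^2 I\big)\, N\big(data_2;\, 0,\, K_f(x_2, x_2) + K_\delta(x_2, x_2) + \sigma^2 I\big)

with no off-diagonal block linking the two data sets.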


Is there a way to sample according to the first one? What would be the most effective way to introduce the correlation?

Maybe Kronecker Structured Covariances? @bwengals?

Hmm, I am not sure that it is applicable, since the covariance matrix for (data_1, data_2)^T does not have a Kronecker structure. At this point I am looking for any way to implement this scenario in pymc3.

I was thinking a little more about the missing correlation term. I am defining data_1 \sim GP_f(x) and data_2 \sim GP_f(x') + GP_\delta(x'). So, assuming GP_f and GP_\delta are independent, I should have cov(data_1, data_2) = cov(GP_f(x), GP_f(x')) and get non-zero cross terms in the likelihood. Is there something I am missing about the way gp.Marginal works? Should I be using gp.Latent instead?

I think what you are doing in your original code is exactly what you want. As you have outlined the problem, it looks like you don’t need to explicitly model the off-diagonal block C_{12}. There should be nonzero correlation here, since data_1 ~ GP_f and data_2 = GP_f + GP_\delta share the same GP_f.

The only funky thing is adding the two marginals when both are observed. If f were Latent, I think you’d be fine for sure. You should double-check whether both variations give you the same result.
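For reference, here is a minimal sketch of the gp.Latent variant, assuming the same data and priors as the toy model above (latent_model, gp_f, f_all, etc. are illustrative names). Because f is evaluated jointly at x_1 and x_2, the cross-covariance between data_1 and data_2 induced by the shared f enters automatically:

import numpy as np
import pymc3 as pm

x_all = np.concatenate([x_1, x_2])  # evaluate f at both input sets jointly

with pm.Model() as latent_model:
    theta = pm.Gamma("theta", alpha=10, beta=1)
    eta = pm.Gamma("eta", alpha=10, beta=1)
    sigma = pm.InverseGamma("sigma", alpha=10, beta=1)

    gp_f = pm.gp.Latent(cov_func=pm.gp.cov.ExpQuad(1, ls=theta))
    gp_delta = pm.gp.Latent(cov_func=pm.gp.cov.ExpQuad(1, ls=eta))

    f_all = gp_f.prior("f_all", X=x_all)    # latent f at x_1 and x_2 together
    delta = gp_delta.prior("delta", X=x_2)  # discrepancy term, only at x_2

    # explicit Gaussian likelihoods replace marginal_likelihood
    y_engine = pm.Normal("y_engine", mu=f_all[:10], sigma=sigma, observed=data_1)
    y_obs = pm.Normal("y_obs", mu=f_all[10:] + delta, sigma=sigma, observed=data_2)

    trace = pm.sample(1000, chains=1)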

Thanks, I will try that :) Is there a way to print the covariance matrix associated with the whole model (at given test points)? I know I can do it for a specific GP through gp.predict, but I was wondering if there is something similar for the whole object.

No, nothing built in for that. You’d have to math it out and put it as a deterministic that you track while sampling.
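For example, here is a minimal sketch under the toy model above (X_test and K_total are illustrative names): the pymc3 covariance functions are callable on an input array and return symbolic matrices, so you can combine them by hand and wrap the result in pm.Deterministic:

# hypothetical test inputs at which to evaluate the covariance
X_test = np.linspace(0, 10, 20)[:, None]

with toy_model:
    # prior covariance of observables = f + delta at X_test, tracked while sampling
    K_total = pm.Deterministic("K_total", cov_f(X_test) + cov_delta(X_test))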
