Constructing a Gaussian Process Model in PyMCv5

Hello, I’ve encountered an issue while attempting to replicate a Gaussian process model. The original model was constructed as follows using PyMC2:

M = pm.gp.Mean(...)
C = pm.gp.Covariance(...)
# Some observations
pm.gp.observe(M, C, r, y, obs_variance)
# Setting initial values
init_vals = pm.gp.Realization(M, C)(self.x)
gp1 = pm.gp.GPSubmodel(..., M, C, self.x, init_vals=init_vals)
...
# gp1, gp2, gp3 are all created using the gp model mentioned above
@pm.deterministic
def func(gp1, gp2, gp3):
    f = gp1.f_eval * (gp2.f_eval - gp3.f_eval)
    return f

u = func

v = pm.Normal(..., mu=u, tau=..., observed=True, value=...)

I’m unsure how to construct the same model using PyMCv5 and would greatly appreciate any suggestions and help. Thank you!

I’ve never used pymc2, so I won’t be able to help you translate directly. I have to guess that you’re creating a GP defined as f = f1 * (f2 - f3)? The documentation on these two pages might be helpful:

Let me know if that page doesn’t answer your questions.

Hi, bwengals, thank you for your reply. I had previously asked about the possibility of implementing a ‘cumtrapz’ within the model and received a definitive answer. I have recently revisited this issue and would like to re-edit this question to make it more complete. The formula is as follows:

U(x)=\frac{s}{f_{1}(x)\big(f_{4}(x)-f_{5}(x)\big)}\int_{0}^{x}f_{1}(x^{\prime})\Big(f_{2}(x^{\prime})-f_{3}(x^{\prime})\Big)\,dx^{\prime}

f1 through f5 ~ GP, each with observed values that have a fixed variance; s ~ Normal. Finally, I am looking to obtain the posterior of f.
The PyMC2 code can be roughly described as:

M = pm.gp.Mean(...)
C = pm.gp.Covariance(...)
# Some observations
pm.gp.observe(M, C, r, y, obs_variance)
# Setting initial values
init_vals = pm.gp.Realization(M, C)(self.x)
gp1 = pm.gp.GPSubmodel(..., M, C, self.x, init_vals=init_vals)
...
# gp1 through gp5 are all created using the gp model mentioned above
@pm.deterministic
def func(gp1, gp2, gp3, gp4, gp5, s, x):
    f = gp1.f_eval * (gp2.f_eval - gp3.f_eval)
    um = cumtrapz(f, x)
    us = s * um / (gp1.f_eval * (gp4.f_eval - gp5.f_eval))
    return us
u = func
v = pm.Normal(..., mu=u, tau=..., observed=True, value=...)
# sample

I have looked briefly at the PyMC2 documentation on Gaussian processes, which states that after pm.gp.observe, M and C are transformed into M0 and C0; so in gp1 = pm.gp.GPSubmodel(..., M, C, ...), the M and C being passed are actually M0 and C0.

So, what do I need to pass to this ‘cumtrapz’ function? If I use Latent, should I define its mean and covariance as M0 and C0? For example:

M0 = pm.gp.mean...
C0 = pm.gp.cov...
gp1 = pm.gp.Latent(mean_func=M0, cov_func=C0)
f1 = gp1.prior("f1", X=x)
# f1 through f5 are obtained by the same procedure
f = f1 * (f2 - f3)
u_m = cumulative_trapezoid_pt(f, x) / (f1 * (f4 - f5))  # denominator follows the formula
u_s = pm.Deterministic("u_s", s * u_m)
u = pm.Normal("u", mu=u_s, sigma=10, observed=observe_value)
pm.sample()
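For the `cumulative_trapezoid_pt` piece: PyTensor has no built-in cumulative trapezoid, but it can be written with slicing and cumsum. The helper below is my own NumPy sketch, not an existing API; since the same operations (slicing, `cumsum`, `concatenate`) exist in `pytensor.tensor`, it should translate almost line for line to symbolic PyMC variables:

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Cumulative trapezoid rule, returning an array the same length as y.

    The first entry is 0, matching scipy's cumulative_trapezoid with initial=0.
    For a symbolic PyTensor version, replace np with pytensor.tensor and build
    the leading zero with pt.zeros(1) before concatenating.
    """
    increments = 0.5 * (y[1:] + y[:-1]) * (x[1:] - x[:-1])
    return np.concatenate([[0.0], np.cumsum(increments)])

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 2.0])
print(cumulative_trapezoid(y, x))  # [0.  0.5 2. ]
```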

Or using Marginal, for example:

M = pm.gp.mean...
C = pm.gp.cov...
gp = pm.gp.Marginal(mean_func=M, cov_func=C)
y_ = gp.marginal_likelihood("y", X=r, y=y, sigma=10)
f1 = gp.conditional("f1", x)
# f2 through f5 are obtained by the same procedure
f = f1 * (f2 - f3)
u_m = cumulative_trapezoid_pt(f, x) / (f1 * (f4 - f5))
u_s = pm.Deterministic("u_s", s * u_m)
u = pm.Normal("u", mu=u_s, sigma=10, observed=observe_value)
pm.sample()

Forgive my immature ideas; my knowledge in this area is quite limited. If you have any further suggestions or guidance, I would greatly appreciate it. Thank you!

From what I understand, the way you’ve got it written with Latent looks correct to me. pm.Marginal assumes a Gaussian likelihood on y, which maybe you have, but Latent is more straightforward to work with. If you need more speed (and X is one or two dimensional), check out pm.gp.HSGP.


Hi, bwengals, thank you very much for your reply, I will give it a try!