Mean of T process

aplavin · October 2, 2017, 8:39am

Hi!
I use gaussian processes to model a timeseries like this (see image row 2).
Here blue dots are measurements and GP fit is shown with orange line and 68 and 95% regions.
I use an unknown constant mean function c and as you can see in this traceplot of 3 chains (see image row 1) - the constant value is very reasonable - about 0.5-0.6.

Then I fit a T process to compare results. The predictions mean is similar to that of GP, and uncertainties seem even better (see image row 4).

However, at the same time the mean value seems off (see image row 3).
Here its posterior is very spread and closer to 1. The priors are the same in both cases (c ~ HalfNormal(0.5)).

Is this something specific to T processes, or their implementation in pymc?

Here are all plots in a single image as I can’t upload them separately here:

junpenglao · October 2, 2017, 10:23am

Did you try to use a zero mean function? Most of the case it doesn’t make a big difference where you have data, the difference would be where you are trying to make a prediction where it is far away from your observation (e.g., in your case if you try to predict value at x=100 it would be a Gaussian with mean=c). But I agree that the mean value seems off. Could you please post your code and data?

aplavin · October 2, 2017, 1:28pm

Yes, I tried zero mean. But in some cases I fit linear mean function and need its parameters - when using T processes they are entirely off. Will post code and data, but the behavior is similar for basically any data.

bwengals · October 2, 2017, 6:46pm

That is an odd result, since the T process is implemented similarly to gp.Latent. You could also try a GP with a Student T likelihood. I think it’s good to use a constant in a case like this.

What happens if you divide your data by the mean before fitting the T process? Does the constant parameter still get dragged too high?

aplavin · October 3, 2017, 3:14pm

I forgot to mention, but I used GP with student likelihood, same as TP.

And here is the whole code:

with pm.Model() as model:
    c = pm.Normal('c', mu=0, sd=0.5)
    η = pm.HalfNormal("η", sd=1)
    ℓ = pm.HalfNormal("ℓ", sd=0.5)
    ν = pm.Gamma("ν", alpha=2, beta=0.1, shape=2)

    cov = η**2 * pm.gp.cov.ExpQuad(1, ℓ)
    gp = pm.gp.Latent(mean_func=pm.gp.mean.Constant(c),
                      cov_func=cov)
    # comment the above line and uncomment below to switch from GP to TP
#     gp = pm.gp.TP(mean_func=pm.gp.mean.Constant(c),
#                   cov_func=cov,
#                   nu=ν[0])
    f = gp.prior('f', X=x_train[:, None])

    σ = pm.HalfNormal("σ", sd=0.5)
    y_ = pm.StudentT("y", mu=f, lam=1.0/σ, nu=ν[1], observed=y_train)

    trace = pm.sample(njobs=3)

    x_pred = np.linspace(x_train.min() - 1, x_train.max() + 1, 100)
    y_pred = gp.conditional('y_pred', Xnew=x_pred[:, None])
    samples = pm.sample_ppc(trace, vars=[y_pred], samples=1000)

and then

plt.plot(x_train, y_train, '.')
l, = plt.plot(x_pred, np.median(samples['y_pred'], 0))
plt.fill_between(x_pred, *np.percentile(samples['y_pred'], [16, 84], 0), alpha=0.4, color=l.get_color())
plt.fill_between(x_pred, *np.percentile(samples['y_pred'], [2.5, 97.5], 0), alpha=0.2, color=l.get_color())

to get the prediction plots from the first post. The data:

x_train = [ -2.91991786,  -2.75564682,  -2.65434634,  -2.62149213,
        -2.44079398,  -2.04106776,  -1.89322382,  -1.71526352,
        -1.52361396,  -1.39493498,  -1.25256674,  -1.03080082,
        -0.71594798,  -0.64750171,  -0.03422313,   0.0807666 ,
         0.19575633,   0.38740589,   0.80903491,   1.19233402,
         1.26899384,   1.50718686,   1.82477755,   2.04106776,
         2.04106776,   2.17522245,   2.34770705,   2.55852156,
         2.73100616,   3.19096509,   3.34428474,   3.45927447,
         3.51676934,   3.70841889,   5.51540041,   6.5229295 ,
         6.96098563,   7.48117728,   7.5797399 ,   7.9247091 ,
         8.36550308,  10.47364819,  13.02258727]
y_train = [ 0.70423359,  0.72200438,  0.77554082,  0.70194589,  0.64435481,
        0.99989352,  0.66998084,  0.55096063,  0.50594506,  0.54125961,
        0.56752747,  0.37616882,  0.43905362,  0.5621031 ,  0.47422528,
        0.51995446,  0.47248769,  0.59128824,  0.69214369,  0.55897017,
        0.66306419,  0.61727744,  0.56153796,  0.50519954,  0.46369249,
        0.45691936,  0.60405043,  0.44529634,  0.52326453,  0.28551515,
        0.37393222,  0.6092491 ,  0.52784151,  0.61273912,  0.64628011,
        0.5829985 ,  0.76831989,  0.17223425,  1.25646104,  0.81781078,
        0.44856902,  0.55787261,  0.33272409]

aplavin · October 3, 2017, 3:32pm

As for dividing my mean - surprisingly, it gave about the same value for c (a bit lower than 1). I’m not very familiar with T processes and can’t guess what’s the possible reasons are, what do you think?

junpenglao · October 4, 2017, 7:29am

Hmmm, I cannot see any obvious problem here, but there is a clear bias in gp.TP. @bwengals?

bwengals · October 4, 2017, 10:52pm

I’m not spotting anything obvious either. Will look into it further and get back… Thanks for posting your code+data!

Topic		Replies	Views
Gaussian process regression: samples from mean model Questions	1	376	August 25, 2020
Gaussian Process predict over trace Questions	2	812	September 4, 2018
Gaussian Process -Statistical rethinking v5 gaussian_process	2	835	January 28, 2023
Posterior predictive checks with Gaussian Process	4	1286	May 26, 2022
Using Marginal.predict method with GP trace InferenceData v5 gaussian_process	2	467	May 21, 2022

Mean of T process

Related topics