Multidimensional input using Gaussian Process

Hi All,

I’m new to PyMC3 and may be making a simple mistake, but I am trying to build a Gaussian process regression model and feed in a 2D input. I have adapted my example from here:

I have used PCA to preprocess my input and built the following model:

number_of_pcs = 2

with pm.Model() as model:

    z = pm.Gamma('z', 1, 1, shape=number_of_pcs)
    nu = pm.Gamma('nu', 1, 1, shape=number_of_pcs)
    K = nu * pm.gp.cov.ExpQuad(number_of_pcs, z)

    mu = pm.gp.mean.Zero()
    sigma = pm.HalfCauchy('sigma', 2.5)

    x = features.iloc[:500, :number_of_pcs].values
    y = y_df.values[:500]

    y_obs = pm.gp.GP('y_obs', mean_func=mu, cov_func=K, sigma=sigma, observed={'X': x, 'Y': y})

However, I get the following error:

ValueError: Input dimension mis-match (input[0].shape[1] = 500, input[1].shape[1] = 2)

(using X=x, observed=y gives the same error)

This works when number_of_pcs = 1, but fails if > 1.

Any ideas on how I should proceed? Or any examples of a similar model that I can try to learn from?

Thank you in advance.

Hey there, this works for me in 1 or higher dimensions:

y_obs = pm.gp.GP("y_obs", mean_func=mu, cov_func=K, X=x, sigma=sigma, observed={'X': x, 'Y': y})

Notice the addition of X=x. Does this syntax work for you? I’m on master and I’m seeing the same behavior as you. I’ll see about a patch for this asap. Thank you for posting this!

Thanks @bwengals,

Unfortunately, that does not work for me; it still fails with the same error message (3.1 master). I wasn’t sure it was an issue previously (I thought maybe I had done something wrong). Do you want me to open a GitHub issue?

I tracked my issue back to line 122 of gp\cov.py.

There’s a list of two factors returned by merge_factors, but the product cannot be calculated. If I change:

K = nu * pm.gp.cov.ExpQuad(number_of_pcs, z)

to:

K = pm.gp.cov.ExpQuad(number_of_pcs, z)

then it works correctly. I tried to figure out the underlying issue but my knowledge of theano is a bit lacking.

Ah! I think I see it now.

nu = pm.Gamma('nu', 1, 1, shape=number_of_pcs)
K = nu * pm.gp.cov.ExpQuad(number_of_pcs, z)

nu should be a scalar since its role is to scale the covariance matrix, but you have it defined as a vector of random variables. Try removing the shape=number_of_pcs part.
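
For anyone who finds this thread later, here is a rough NumPy sketch of the shape problem. It is only an analogy: NumPy's broadcasting stands in for what Theano does inside the GP, and exp_quad below is a hypothetical helper written for illustration, not PyMC3's own code. The inputs are random stand-ins for the PCA features. It shows that the length-scales can hold one value per input dimension because they are applied to the inputs before the distances are computed, while the multiplier in front of the kernel meets the full 500 x 500 matrix and so has to be a scalar.

```python
import numpy as np

def exp_quad(X, lengthscales):
    """Hypothetical squared-exponential kernel, one length-scale per column of X."""
    Z = X / lengthscales                                   # (n, d) rescaled inputs
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)  # (n, n)
    return np.exp(-0.5 * sq_dist)

rng = np.random.default_rng(0)
n, d = 500, 2
X = rng.normal(size=(n, d))                                # stand-in for the PCA features

# A vector of length-scales is fine: it divides the (n, d) inputs column-wise.
K = exp_quad(X, lengthscales=np.array([1.0, 2.0]))         # shape (500, 500)

# A length-2 multiplier cannot be lined up with a (500, 500) matrix.
nu_vector = np.array([0.5, 1.5])
try:
    nu_vector * K
    vector_scaling_failed = False
except ValueError:
    vector_scaling_failed = True

# A length-1 multiplier broadcasts like a scalar, which is consistent with
# the model running when number_of_pcs = 1.
nu_len1 = np.array([0.5])
K_len1 = nu_len1 * K

# A true scalar simply rescales every entry of the covariance matrix.
nu_scalar = 0.5
K_scaled = nu_scalar * K
```

In the model from the original post, this corresponds to keeping shape=number_of_pcs on z and dropping it from nu.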

Oh, ok I see. Thanks for all your help, I understand now (noob mistake…).

No prob! Thanks for posting. I messed with it for quite a while before noticing.