Either "too many indices for array" or "array must not contain infs or NaNs" error with a GP

So, I have encountered a weird bug when attempting to run a simple Gaussian Process model, and I am quite confused as to what is happening. My setup is Numpy: 1.16.0, Theano: 1.0.4, Pymc3: 3.6, and this problem persists across both my Ubuntu work machine and personal Mac.

First, I will post my code.

My data:

x = np.linspace(0,40, num=300)
noise1 = np.random.normal(0,0.3,300)
y = np.sin(x) + noise1

temp = x[150:]
noise2 = 0.004*temp**2 + np.random.normal(0,0.1,150)
y[150:] = y[150:] + noise2

true_line = np.sin(x)
true_line[150:] = true_line[150:] + 0.004*temp**2

x_sin = x[:150]
y_sin = y[:150]

X_sin = np.expand_dims(x, axis=1)
Y_sin = np.expand_dims(y, axis=1)

test_X_sin_1dim = np.linspace(-20,40,500)
test_X_sin_2dim = np.expand_dims(test_X_sin_1dim, axis=1)

plt.plot(x_sin, y_sin)

Model:

with pm.Model() as gp_model_1:

    period = pm.Normal("period", mu=0, sd=10)

    ℓ_psmooth = pm.Gamma("ℓ_psmooth ", alpha=4, beta=3)

    cov_seasonal = pm.gp.cov.Periodic(1, period, ℓ_psmooth)
    gp_seasonal = pm.gp.Marginal(cov_func=cov_seasonal)

    σ  = pm.HalfNormal("σ", sd=10, testval=5)

    gp = gp_seasonal

    y_ = gp.marginal_likelihood("y", X=x_sin, y=y_sin, noise = σ, shape=1)

If I run the code as is, I receive the familiar “too many indices for array” error, which I proceed to fix by creating an additional dimension for my input by setting X=x_sin[:, None].

However, now I get the newer error of “array must not contain infs or NaNs”, which is strange because my two arrays do not have any infs or NaNs. Adjusting the shape parameter does nothing here.

Would anyone know what is going on?

1 Like

Gist:

import numpy as np
import pymc3 as pm

x = np.linspace(0, 40, num=150)
y = np.sin(x)
# x = x[:, None]
# y = y[:, None]

with pm.Model():
    p = pm.Normal("p", mu=0, sd=10)
    l = pm.Gamma("l", alpha=4, beta=3)
    a = pm.gp.cov.Periodic(1, p, l)
    b = pm.gp.Marginal(cov_func=a)
    s = pm.HalfNormal("s", sd=10, testval=5)
    b.marginal_likelihood(
        "y", X=x, y=y, noise=s, shape=1)

Error:

Traceback (most recent call last):
  File "<...>/test2.py", line 16, in <module>
    "y", X=x, y=y, noise=s, shape=1)
  File "<...>/pymc3/gp/gp.py", line 419, in marginal_likelihood
  File "<...>/pymc3/gp/gp.py", line 380, in _build_marginal_likelihood
  File "<...>/pymc3/gp/cov.py", line 61, in __call__
  File "<...>/pymc3/gp/cov.py", line 291, in full
  File "<...>/pymc3/gp/cov.py", line 70, in _slice
IndexError: too many indices for array

Error with lines un-commented:

Traceback (most recent call last):
  File "<...>/test2.py", line 16, in <module>
    "y", X=x, y=y, noise=s, shape=1)
  File "<...>/pymc3/gp/gp.py", line 424, in marginal_likelihood
  File "<...>/pymc3/distributions/distribution.py", line 41, in __new__
  File "<...>/pymc3/distributions/distribution.py", line 52, in dist
  File "<...>/pymc3/distributions/multivariate.py", line 225, in __init__
  File "<...>/pymc3/distributions/multivariate.py", line 61, in __init__
  File "<...>/theano/gof/op.py", line 674, in __call__
    required = thunk()
  File "<...>/theano/gof/op.py", line 892, in rval
    r = p(n, [x[0] for x in i], o)
  File "<...>/theano/tensor/slinalg.py", line 76, in perform
    z[0] = scipy.linalg.cholesky(x, lower=self.lower).astype(x.dtype)
  File "<...>/scipy/linalg/decomp_cholesky.py", line 91, in cholesky
    check_finite=check_finite)
  File "<...>/scipy/linalg/decomp_cholesky.py", line 19, in _cholesky
    a1 = asarray_chkfinite(a) if check_finite else asarray(a)
  File "<...>/numpy/lib/function_base.py", line 498, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
1 Like

Doc says X needs to be a column vector. I guess both x and y need to be column vector?

The first piece of code in the doc doesn’t actually work. The code didnt’ define variable y.

Fix for 1st error:

import numpy as np
import pymc3 as pm

x = y = np.arange(150)[:, None]

with pm.Model():
    a = pm.gp.cov.Periodic(1, 2, 3)
    b = pm.gp.Marginal(cov_func=a)
    b.marginal_likelihood(
        "y", X=x, y=y, noise=3)

But that doesn’t fix the 2nd error.

1 Like

The code seems an adaptation of https://bwengals.github.io/looking-at-the-keeling-curve-with-gps-in-pymc3.html, which probably doesn’t work in pymc3 anymore.

The second Mauna Lua example was precisely what I was working with, but yeah that was my guide.

How else would one fit such a Gaussian Process then? The code does not seem too idiosyncratic, only utilizing 3 priors and 1 kernel. I imagine there is probably another way to define the Periodic component outside of pm.gp.Marginal?