Either "too many indices for array" or "array must not contain infs or NaNs" error with a GP

Gon_F · May 16, 2019, 8:21pm

So, I have encountered a weird bug when attempting to run a simple Gaussian Process model, and I am quite confused as to what is happening. My setup is Numpy: 1.16.0, Theano: 1.0.4, Pymc3: 3.6, and this problem persists across both my Ubuntu work machine and personal Mac.

First, I will post my code.

My data:

x = np.linspace(0,40, num=300)
noise1 = np.random.normal(0,0.3,300)
y = np.sin(x) + noise1

temp = x[150:]
noise2 = 0.004*temp**2 + np.random.normal(0,0.1,150)
y[150:] = y[150:] + noise2

true_line = np.sin(x)
true_line[150:] = true_line[150:] + 0.004*temp**2

x_sin = x[:150]
y_sin = y[:150]

X_sin = np.expand_dims(x, axis=1)
Y_sin = np.expand_dims(y, axis=1)

test_X_sin_1dim = np.linspace(-20,40,500)
test_X_sin_2dim = np.expand_dims(test_X_sin_1dim, axis=1)

plt.plot(x_sin, y_sin)

Model:

with pm.Model() as gp_model_1:

    period = pm.Normal("period", mu=0, sd=10)

    ℓ_psmooth = pm.Gamma("ℓ_psmooth ", alpha=4, beta=3)

    cov_seasonal = pm.gp.cov.Periodic(1, period, ℓ_psmooth)
    gp_seasonal = pm.gp.Marginal(cov_func=cov_seasonal)

    σ  = pm.HalfNormal("σ", sd=10, testval=5)

    gp = gp_seasonal

    y_ = gp.marginal_likelihood("y", X=x_sin, y=y_sin, noise = σ, shape=1)

If I run the code as is, I receive the familiar “too many indices for array” error, which I proceed to fix by creating an additional dimension for my input by setting X=x_sin[:, None].

However, now I get the newer error of “array must not contain infs or NaNs”, which is strange because my two arrays do not have any infs or NaNs. Adjusting the shape parameter does nothing here.

Would anyone know what is going on?

Hamster_on_wheels · May 17, 2019, 10:57pm

Gist:

import numpy as np
import pymc3 as pm

x = np.linspace(0, 40, num=150)
y = np.sin(x)
# x = x[:, None]
# y = y[:, None]

with pm.Model():
    p = pm.Normal("p", mu=0, sd=10)
    l = pm.Gamma("l", alpha=4, beta=3)
    a = pm.gp.cov.Periodic(1, p, l)
    b = pm.gp.Marginal(cov_func=a)
    s = pm.HalfNormal("s", sd=10, testval=5)
    b.marginal_likelihood(
        "y", X=x, y=y, noise=s, shape=1)

Error:

Traceback (most recent call last):
  File "<...>/test2.py", line 16, in <module>
    "y", X=x, y=y, noise=s, shape=1)
  File "<...>/pymc3/gp/gp.py", line 419, in marginal_likelihood
  File "<...>/pymc3/gp/gp.py", line 380, in _build_marginal_likelihood
  File "<...>/pymc3/gp/cov.py", line 61, in __call__
  File "<...>/pymc3/gp/cov.py", line 291, in full
  File "<...>/pymc3/gp/cov.py", line 70, in _slice
IndexError: too many indices for array

Error with lines un-commented:

Traceback (most recent call last):
  File "<...>/test2.py", line 16, in <module>
    "y", X=x, y=y, noise=s, shape=1)
  File "<...>/pymc3/gp/gp.py", line 424, in marginal_likelihood
  File "<...>/pymc3/distributions/distribution.py", line 41, in __new__
  File "<...>/pymc3/distributions/distribution.py", line 52, in dist
  File "<...>/pymc3/distributions/multivariate.py", line 225, in __init__
  File "<...>/pymc3/distributions/multivariate.py", line 61, in __init__
  File "<...>/theano/gof/op.py", line 674, in __call__
    required = thunk()
  File "<...>/theano/gof/op.py", line 892, in rval
    r = p(n, [x[0] for x in i], o)
  File "<...>/theano/tensor/slinalg.py", line 76, in perform
    z[0] = scipy.linalg.cholesky(x, lower=self.lower).astype(x.dtype)
  File "<...>/scipy/linalg/decomp_cholesky.py", line 91, in cholesky
    check_finite=check_finite)
  File "<...>/scipy/linalg/decomp_cholesky.py", line 19, in _cholesky
    a1 = asarray_chkfinite(a) if check_finite else asarray(a)
  File "<...>/numpy/lib/function_base.py", line 498, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

Hamster_on_wheels · May 17, 2019, 11:24pm

Doc says X needs to be a column vector. I guess both x and y need to be column vector?

The first piece of code in the doc doesn’t actually work. The code didnt’ define variable y.

Fix for 1st error:

import numpy as np
import pymc3 as pm

x = y = np.arange(150)[:, None]

with pm.Model():
    a = pm.gp.cov.Periodic(1, 2, 3)
    b = pm.gp.Marginal(cov_func=a)
    b.marginal_likelihood(
        "y", X=x, y=y, noise=3)

But that doesn’t fix the 2nd error.

Hamster_on_wheels · May 17, 2019, 11:59pm

The code seems an adaptation of https://bwengals.github.io/looking-at-the-keeling-curve-with-gps-in-pymc3.html, which probably doesn’t work in pymc3 anymore.

Gon_F · May 18, 2019, 1:31am

The second Mauna Lua example was precisely what I was working with, but yeah that was my guide.

How else would one fit such a Gaussian Process then? The code does not seem too idiosyncratic, only utilizing 3 priors and 1 kernel. I imagine there is probably another way to define the Periodic component outside of pm.gp.Marginal?

Topic		Replies	Views
IndexError: too many indices for array - Non-linear matrix factorization with missing values using Gaussian Processes - using pymc3 v3 theano , bug , gaussian_process , modeling	1	287	April 12, 2023
Multidimensional gaussian process Questions	55	10104	October 22, 2020
Value error in chains for a gaussian process v5	3	578	July 15, 2022
Batched Gaussian Processes Questions	3	563	August 6, 2019
2d classification with GP, problem Questions	2	880	June 21, 2017

Either "too many indices for array" or "array must not contain infs or NaNs" error with a GP

Related topics