IndexError: too many indices for array - Non-linear matrix factorization with missing values using Gaussian Processes - using pymc3

Hi everyone,

I am currently implementing the following non-linear matrix factorization (MF) approach by N. D. Lawrence et al. I am running into errors setting up the model correctly, partly because the matrix I want to factorize and predict has missing values.

I’ve previously implemented the PMF approach by Salakhutdinov and Mnih, which worked well with the following model:

    # data and mask of missing values
    nan_mask_matrix = np.isnan(self.data_matrix)
    self.data_matrix[nan_mask_matrix] = self.data_matrix[~nan_mask_matrix].mean()
    self.model = pm.Model()

    with self.model as pmf:

        W = pm.MvNormal(
            "W",
            mu=0,
            tau=self.alpha_w * np.eye(dim),
            shape=(n, dim),
            testval=np.random.randn(n, dim) * std,
        )

        H = pm.MvNormal(
            "H",
            mu=0,
            tau=self.alpha_h * np.eye(dim),
            shape=(m, dim),
            testval=np.random.randn(m, dim) * std,
        )

        R = pm.Normal(
            "R",
            mu=(W @ H.T)[~nan_mask_matrix],
            tau=self.alpha,
            observed=self.data_matrix[~nan_mask_matrix],
        )
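For context, the boolean mask turns both the mean matrix and the data into matching 1-D vectors over the observed entries only, which is why `mu` and `observed` line up. A quick NumPy sketch of that masking step, with made-up sizes:

```python
import numpy as np

n, m, dim = 4, 5, 2
rng = np.random.default_rng(0)

# Toy data matrix with ~30% of entries missing
data = rng.normal(size=(n, m))
data[rng.random(size=(n, m)) < 0.3] = np.nan
nan_mask = np.isnan(data)

W = rng.normal(size=(n, dim))
H = rng.normal(size=(m, dim))

mean = (W @ H.T)[~nan_mask]  # 1-D vector of means at observed cells
obs = data[~nan_mask]        # 1-D vector of observed values

assert mean.shape == obs.shape  # one entry per observed cell in both
```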

Now, when adapting my approach to non-linear MF, my idea is to replace the mean of R (the conditional distribution over the observed values) with a matrix F of samples from a Gaussian process (GP) prior, instead of the matrix product W @ H.T.

The current mean has dimensions m × n. The goal is for F to eventually consist of n GP priors, each of length m. Starting with just one prior, I therefore change R to

    r = pm.Normal("r", mu=f[~nan_mask_one_column], tau=self.alpha, observed=self.data_one_column[~nan_mask_one_column])

The following implementation works well:

    self.data_one_column = self.data_matrix[1,:]
    nan_mask_one_column = nan_mask_matrix[1,:]

    with self.model as pmf:
        # Parameters of GP
        lengthscale = 0.1
        cov = pm.gp.cov.ExpQuad(ls=lengthscale, input_dim=1)

        # Add white noise to stabilise
        cov += pm.gp.cov.WhiteNoise(1e-6)

        # GP latent variable model, assume 0 mean
        gp = pm.gp.Latent(cov_func=cov)

        # Input x to create the GP prior of dimension m
        x = np.linspace(0, 1, m)[:, None]
        f = gp.prior("f", X=x)

        # Marginal Likelihood
        r = pm.Normal("r", mu=f[~nan_mask_one_column], tau=self.alpha, observed=self.data_one_column[~nan_mask_one_column])

Now, the problem occurs when I try to extend it to a matrix.
This is my current attempt:

    with self.model as pmf:
        # Parameters of GP
        # m: Number of GP priors
        lengthscale = list(0.1*np.ones((m,)))
        cov = pm.gp.cov.ExpQuad(ls=lengthscale, input_dim=m, active_dims=list(range(0, m)))

        # Add white noise to stabilise
        cov += pm.gp.cov.WhiteNoise(1e-6)

        # GP latent variable model, assume 0 mean
        gp = pm.gp.Latent(cov_func=cov)

        # Input x to create the GP priors of needed dimension
        x = np.linspace(0, 1, m)[:, None]
        for j in np.arange(1, m):
            x2 = np.linspace(0, 1, m)[:, None]
            x = np.hstack((x, x2))

        F = gp.prior("F", X=x)

        # Regression model
        R = pm.Normal("R", mu=F.T[~nan_mask], tau=self.alpha, observed=self.data[~nan_mask], init=pm.Normal.dist(mu=0.0, sd=1.0), shape=(m,m))

And the error happens in R:

    R = pm.Normal("R", mu=F.T[~nan_mask], tau=self.alpha, observed=self.data[~nan_mask], init=pm.Normal.dist(mu=0.0, sd=1.0), shape=(m,m))

being:

    raise IndexError("too many indices for array")
    IndexError: too many indices for array
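If I understand the shapes correctly, `gp.prior` returns a single vector with one value per row of `X`, so `F` (and hence `F.T`, which is a no-op on a 1-D tensor) is 1-D, while `~nan_mask` is 2-D, and indexing a 1-D array with a 2-D boolean mask is exactly what raises this error. A minimal NumPy reproduction, no GP needed (names are mine):

```python
import numpy as np

m = 4
F = np.zeros(m)                          # 1-D, like a single GP draw over m inputs
nan_mask = np.zeros((m, m), dtype=bool)  # 2-D, like nan_mask_matrix

try:
    F.T[~nan_mask]                       # .T on a 1-D array changes nothing
except IndexError as e:
    print(e)                             # message begins with "too many indices for array"
```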

Would anyone know how to adjust R? Or is there a bigger mistake I am making before computing R?
Thanks in advance, and happy Easter break :)

:) Just in case someone is interested or encounters a similar problem:
I solved it by giving the GP prior the right shape and removing the transpose.

  • I am not yet sure whether the logic is correct (I am currently verifying this), but the code runs and debugging is possible.
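For anyone finding this later: as far as I can tell from the pymc3 source, `gp.Latent.prior` accepts a `shape` keyword, and with `reparameterize=True` the prior is built (roughly) as `mu + cholesky(K) @ v`, where `v` gets that shape. So a 2-D shape yields several independent GP columns that a 2-D mask can index directly. A NumPy-only sketch of that shape logic (not the actual pymc3 fix; all names here are mine):

```python
import numpy as np

m, n = 5, 4
rng = np.random.default_rng(1)

# Squared-exponential kernel on m inputs, plus jitter (mirrors the model above)
x = np.linspace(0, 1, m)[:, None]
K = np.exp(-0.5 * ((x - x.T) / 0.1) ** 2) + 1e-6 * np.eye(m)
L = np.linalg.cholesky(K)

# With shape=(m, n), the reparameterized zero-mean prior is roughly L @ v
v = rng.normal(size=(m, n))
F = L @ v                      # (m, n): n independent GP draws of length m

nan_mask = rng.random(size=(m, n)) < 0.3
masked = F[~nan_mask]          # 1-D, no transpose needed, no IndexError
assert masked.ndim == 1
```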