Shape mismatch when running predictions with Gaussian Processes

Hello, PyMC-community!

I am trying to build a model in which one of the components is a Gaussian process.

However, when I try to get out-of-sample predictions with pm.sample_posterior_predictive(idata), I get a shape mismatch error:
Input dimension mismatch: (input[1].shape[0] = 5, input[2].shape[0] = 10)

I have already looked at the solutions in other threads about shape mismatches with pm.sample_posterior_predictive() (e.g. here and here), such as adding dims, changing shapes for all variables, and setting shape=mu.shape on the observed variable, but they do not seem to work for me.

The code is here:

import numpy as np 
import pymc as pm

integer_time = np.arange(0,10)
future_integer_time = np.arange(0,5)

with pm.Model() as model:
    
    x = pm.MutableData(name="x", value=[1,5,6,4,2,3,4,7,8,9], dims='obs_id')
    y = pm.MutableData(name="y", value=[11,52,61,43,22,34,41,76,83,92], dims='obs_id') # y = 10*x + noise, here always positive

    beta = pm.Normal("beta", mu=0, sigma=500)
    sigma = pm.HalfNormal("sigma", sigma=500)

    eta = pm.HalfNormal(name='eta', sigma=10)
    ls = pm.HalfNormal(name='ls', sigma=10)

    cov_func = eta**2 * pm.gp.cov.Matern52(1, ls=ls)
    gp = pm.gp.HSGP(m=[30], c=1.2, cov_func=cov_func)
    gp_rv = gp.prior('gp_rv',integer_time[:,None])
    
    mu = pm.Deterministic('mu', beta * x + gp_rv, dims='obs_id')
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=y, dims='obs_id') # also tried to add shape=mu.shape, but does not seem to work
    
    idata = pm.sample()

# Want to get mu predictions for new data
with model:
    pm.set_data( {'x': [4,7,3,9,1]} ) # Also tried adding 'y': [0,0,0,0,0]
    gp_rv_pred = gp.conditional("gp_rv_pred", Xnew=future_integer_time[:, None])
    posterior_pred = pm.sample_posterior_predictive(idata)

Thanks so much in advance!
PS. Using PyMC 5.19.0; also tried later versions.

Hey @aabugaev, have you tried making integer_time[:, None] a pm.Data() object and then update that in your pm.set_data() with future_integer_time[:, None]?

Also, this example is a really good reference when using HSGPs.

1 Like

Hi, from reading your post, it looks like you may be after some sort of post-training validation; if I misread, I apologize.

One thing that has helped me tremendously in my own workflow is to do something very similar to @Dekermanjian’s recommendation: name your spatial and temporal variables, train your model on your train set, and then, after performing your inference/checks, extend your original data. This notebook: Gaussian Processes: HSGP Advanced Usage — PyMC example gallery has an example of doing that. Here’s the most relevant piece of code, after you have trained your original model:

with model:
    pm.set_data({"X": x_full[:, None]})

    idata.extend(
        pm.sample_posterior_predictive(
            idata,
            var_names=["f_mu", "f"],
            predictions=True,
            compile_kwargs={"mode": "NUMBA"},
            random_seed=rng,
        ),
    )

Here, you can set the training data container "X" to contain all of your points.

I realize I’m abstracting away probably the most important bits, like defining your Kronecker structure, defining hyperparameters, etc., but if you’re after some code to help you work backwards, this block might serve as a good reference!

1 Like

@Dekermanjian @brontidon thanks so much, this helped! Indeed, putting integer_time into a MutableData container makes it work.

Here’s the model I ended up with:

with pm.Model() as model:
    
    integer_time = pm.MutableData(name="integer_time", value=np.arange(0,10), dims='obs_id') # make sure to pass time as a data variable
    x = pm.MutableData(name="x", value=[1,5,6,4,2,3,4,7,8,9], dims='obs_id')
    y = pm.MutableData(name="y", value=[11,52,61,43,22,34,41,76,83,92], dims='obs_id') # y = 10*x + noise, here always positive

    beta = pm.Normal("beta", mu=0, sigma=500)
    sigma = pm.HalfNormal("sigma", sigma=500)

    eta = pm.HalfNormal(name='eta', sigma=10)
    ls = pm.HalfNormal(name='ls', sigma=10)

    cov_func = eta**2 * pm.gp.cov.Matern52(1, ls=ls)
    gp = pm.gp.HSGP(m=[30], c=1.2, cov_func=cov_func)
    gp_rv = gp.prior('gp_rv', integer_time[:, None])
    
    mu = pm.Deterministic('mu', beta * x + gp_rv, dims='obs_id')
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=y, dims='obs_id')
    
    idata = pm.sample()

# Want to get mu predictions for new data
with model:
    pm.set_data({"integer_time": np.arange(0,15), "x": [1,5,6,4,2,3,4,7,8,9,  12,8,5,4,2]}) # reset integer time here
    gp_rv_pred = gp.conditional("gp_rv_pred", integer_time[:, None])
    idata.extend(pm.sample_posterior_predictive(idata, var_names=['obs','gp_rv_pred'] ))

Also, while looking for solutions, I found @juanitorduz's code samples, which also make it work, but create an additional data container instead (source):

with gp_pymc_model:
    x_star_data = pm.MutableData("x_star_data", x_test)
    f_star = gp.conditional("f_star", x_star_data[:, None])
    pm.Normal("likelihood_test", mu=f_star, sigma=noise)

Hey @aabugaev, I am happy to have helped. I just want to add a couple of things. Here is your original code modified with some commentary:

import numpy as np 
import pymc as pm

integer_time = np.arange(0,10)
future_integer_time = np.arange(0,5)

with pm.Model() as model:
    # PyMC's Data class is now mutable by default, so there is no need to call pm.MutableData()
    x = pm.Data(name="x", value=[1,5,6,4,2,3,4,7,8,9], dims='obs_id')
    y = pm.Data(name="y", value=[11,52,61,43,22,34,41,76,83,92], dims='obs_id') # y = 10*x + noise, here always positive
    integer_time = pm.Data(name="integer_time", value=integer_time[:, None])

    beta = pm.Normal("beta", mu=0, sigma=500)
    sigma = pm.HalfNormal("sigma", sigma=500)

    eta = pm.HalfNormal(name='eta', sigma=10)
    ls = pm.HalfNormal(name='ls', sigma=10)

    cov_func = eta**2 * pm.gp.cov.Matern52(1, ls=ls)
    gp = pm.gp.HSGP(m=[30], c=1.2, cov_func=cov_func)
    gp_rv = gp.prior('gp_rv', X=integer_time)
    
    mu = pm.Deterministic('mu', beta * x + gp_rv, dims='obs_id')
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=y, dims='obs_id') # also tried to add shape=mu.shape, but does not seem to work
    
    idata = pm.sample()

# Want to get mu predictions for new data
with model:
    # When using HSGP with pm.set_data() for out-of-sample predictions, there is no need
    # to explicitly call gp.conditional(). Also, if you pass predictions=True to
    # pm.sample_posterior_predictive(), you don't need to supply your y variable in pm.set_data()
    pm.set_data( {'x': [4,7,3,9,1], 'integer_time': future_integer_time[:, None]} ) # Also tried adding 'y': [0,0,0,0,0]
    posterior_pred = pm.sample_posterior_predictive(idata, predictions=True)

I hope this helps!

2 Likes

Thank you!

Having briefly looked into the notebook you recommended, I got the impression that avoiding conditional is only possible when using the linearized version of HSGP.
Also thanks for elaborating about the predictions.

I greatly appreciate your help!

2 Likes