PyMC plot pm.gp.util.plot_gp_dist

davidsuculum · October 6, 2022, 10:05am

Perhaps a conceptual question. In a Gaussian Process I have performed (and went ok, omitted for brevity. I want to represent the uncertainty of 50 predicted new points, so I did:

X_new = np.linspace(np.floor(X.min()-2), np.ceil(X.max()+2), 50)[:,None]

with gp_model:
    f_pr = gp.conditional('f_pr', X_new)
    pred_samples = pm.sample_posterior_predictive(trace, var_names=["f_pr"],
                                                 extend_inferencedata=True,
                                                 predictions=True)

If I ask pred_samples.predictions I get

Dimensions:     (chain: 2, draw: 1000, f_pr_dim_0: 50)
Coordinates:
  * chain       (chain) int64 0 1
  * draw        (draw) int64 0 1 2 3 4 5 6 7 ... 992 993 994 995 996 997 998 999
  * f_pr_dim_0  (f_pr_dim_0) int64 0 1 2 3 4 5 6 7 8 ... 42 43 44 45 46 47 48 49
Data variables:
    f_pr        (chain, draw, f_pr_dim_0) float64 14.97 16.07 ... 40.34 40.38
Attributes:
    created_at:                 2022-10-06T09:12:51.519707
    arviz_version:              0.12.1
    inference_library:          pymc
    inference_library_version:  4.1.7

So 3 dimensions, the last one the number of predicted points, ok. But I don’t know what should I do to represent with pm.gp.util.plot_gp_dist

_, ax = plt.subplots(figsize=(12,5))
pm.gp.util.plot_gp_dist(ax, 
                        np.mean(pred_samples.predictions['f_pr'].stack(samples=("chain", "draw")).values, axis=1),
                        X_new, palette='viridis', plot_samples=False)
ax.plot(X, y, 'ko')

Error here is AxisError: axis 1 is out of bounds for array of dimension 1

I have also tried

pm.gp.util.plot_gp_dist(ax, 
                        pred_samples.predictions['f_pr'],
                        X_new, palette='viridis', plot_samples=False)

With ValueError: 'y1' is not 1-dimensional.

So not clear how to represent with pm.gp.util.plot_gp_dist. Thanks.

OriolAbril · October 11, 2022, 10:57am

The docstring for plot_gp_dist says:

samples: numpy.ndarray
    Array of S posterior predictive sample from a GP.
    Expected shape: (S, X)
x: numpy.ndarray
    Grid of X values corresponding to the samples.
    Expected shape: (X,) or (X, 1), or (1, X)

Therefore, my recommendation would be doing something like:

f_pr = az.extract(pred_samples,var_name="f_pr", group="predictions").transpose("sample", ...)
pm.gp.util.plot_gp_dist(ax, f_pr, X_new, ...)

az.extract defaults to stacking the chain and draw dimensions into the sample one. xarray defaults to adding new dimensions at the end, so to comply with the S, X shape needed for plot_gp_dist you need the transpose.

Note: this for now requires arviz development version until the next release is out (should happen in 1-2 weeks time).

Topic		Replies	Views
2D Gaussian Process: error generating posterior predictive samples v5 bug , gaussian_process	2	555	April 3, 2023
Multivariate Gaussian Processes Questions	2	560	July 27, 2020
Multivariate Gaussian Process Classification (GPC) and Prediction Questions	2	895	February 23, 2020
How to make out-of-sample predictions with pymc model v5	1	647	February 8, 2023
GP Latent implmenentation in PyMC version agnostic gaussian_process	4	64	February 23, 2025

PyMC plot pm.gp.util.plot_gp_dist

Related topics