Dimension error when using az.plot_lm() when running tutorial notebook?

I am trying to run this tutorial notebook on PyMC 4.3, and the following call raises an error:

az.plot_lm(
    idata=idata,
    y="obs",
    x=predictor_scaled,
    kind_pp="hdi",
    y_kwargs={"color": "C0", "marker": "o", "ms": 4, "alpha": 0.4},
    y_hat_fill_kwargs=dict(fill_kwargs={"alpha": 0.8}, color="xkcd:jade"),
    axes=ax,
)

The error message is ValueError: x and y must have same first dimension, but have shapes (100,) and (1,).

In contrast, the following minimal example from ArviZ does work:

import arviz as az
import numpy as np
import xarray as xr
idata = az.load_arviz_data('regression1d')
x = xr.DataArray(np.linspace(0, 1, 100))
idata.posterior["y_model"] = idata.posterior["intercept"] + idata.posterior["slope"]*x
az.plot_lm(idata=idata, y="y", x=x)

I tried to inspect the two idata objects manually but couldn’t spot any issue. Does anyone have an idea?

Hi, sorry, the question slipped through the cracks. I recently fixed a bug in plot_lm, but I think it was a different one, and the notebook no longer uses it. Do you have a minimal example that shows the issue? The data organization and faceting in plot_lm are quite convoluted, but this might be fixable. Given an example, I can take a look, quickly play around with it, and test it.

Hi, I’ve experienced the same issue and what solved it for me was to drop unwanted dimensions. It seems that

idata.posterior["y_model"]

has, for example, shape (1, 1000, 1, 393, 1), with dimensions (‘chain’, ‘draw’, ‘predictors’, ‘dim_0’, ‘dim_1’).
az.plot_lm says that y_model needs to have the same shape as y, plus the chain and draw dimensions. So we want to drop the extra length-1 dimensions, namely ‘predictors’ and ‘dim_1’.

idata.posterior["y_model"] = idata.posterior["y_model"].squeeze(dim=["predictors", "dim_1"], drop=True)

Now, y_model is of shape (1, 1000, 393) and y is of shape (393,).
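
To see the effect in isolation, here is a minimal sketch on a dummy array. It assumes only the dimension names and sizes quoted above; the values are random and do not come from the actual model, and y_model here is a stand-in for idata.posterior["y_model"].

```python
import numpy as np
import xarray as xr

# Dummy stand-in for idata.posterior["y_model"], using the dimension names and
# sizes quoted above (random values, not real posterior draws).
y_model = xr.DataArray(
    np.random.default_rng(0).normal(size=(1, 1000, 1, 393, 1)),
    dims=("chain", "draw", "predictors", "dim_0", "dim_1"),
)

# squeeze() can only remove length-1 dimensions; drop=True also discards any
# scalar coordinates those dimensions would otherwise leave behind.
squeezed = y_model.squeeze(dim=["predictors", "dim_1"], drop=True)

print(squeezed.dims)   # ('chain', 'draw', 'dim_0')
print(squeezed.shape)  # (1, 1000, 393)
```

Note that squeeze raises an error if one of the named dimensions has length greater than 1, so this only works when the extra dimensions are truly singleton.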


You can also use the extract() function from ArviZ, which accomplishes much the same thing.

That is very useful, thanks! It would probably be better to modify the code within plot_lm and add a squeeze for dimensions (other than chain and draw) that are in y_model or y_hat but not in the observed data, or at least to check for them and raise a more informative warning or error.
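
A rough sketch of what such a check could look like, working on plain tuples of dimension names. The helpers extra_dims and check_dims are hypothetical and not part of ArviZ; the real plot_lm internals may well need something different.

```python
def extra_dims(model_dims, observed_dims, sample_dims=("chain", "draw")):
    """Return dims of the model variable that are neither sample dims
    nor present in the observed data (hypothetical helper)."""
    return [
        d for d in model_dims
        if d not in sample_dims and d not in observed_dims
    ]


def check_dims(model_dims, observed_dims, var_name="y_model"):
    """Raise an informative error if the model variable has extra dims."""
    extra = extra_dims(model_dims, observed_dims)
    if extra:
        raise ValueError(
            f"{var_name} has dimensions {extra} that are not in the observed "
            "data; squeeze or select them before plotting."
        )


# Dimension names taken from the example earlier in the thread.
model_dims = ("chain", "draw", "predictors", "dim_0", "dim_1")
observed_dims = ("dim_0",)

print(extra_dims(model_dims, observed_dims))  # ['predictors', 'dim_1']
```

With a check like this, the user would be told which dimensions are the problem instead of hitting the matplotlib shape error further down.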