Lm_plot with bambi

I’m trying to create a regression plot with lm_plot in conjunction with bambi. I passed the inference data object into lm_plot, but I don’t know what other parameters I’m supposed to pass in to make it work. Any examples of how to create these regression plots with bambi would be helpful. Thank you!

Hello!

Do you have an example of what you’re trying to do?

However, I think it would be better to use plot_cap from Bambi.

Do

from bambi.plots import plot_cap

plot_cap(model, idata, ["predictor"])

And it should work.

I want to see the posterior predictive samples in addition to the observed data points. Here is an example of what I’m trying to do:

I’m not sure how to do this with the inference data returned from the bambi model. When I pass it to the arviz plot_lm function, it gives me the following error:

ValueError: x and y must have same first dimension, but have shapes (200,) and (1,)

If you have a minimal reproducible example of your data and model, I can help you with the code for the visualization.

So I was able to figure it out by doing it this way:

model = bmb.Model('y ~ x', train_data, dropna=True)
fitted = model.fit(draws=5000, chains=4)
idata = model.predict(fitted, kind='pps', inplace=False)
data = az.from_dict(
    posterior={'y_mean': idata.posterior.y_mean},
    observed_data = {'y': idata.observed_data.y},
    posterior_predictive = {'y': idata.posterior_predictive.y},
    dims={"y": ["x"]},
    coords={"x": train_data['x']}
)
az.plot_lm(idata=data, y='y', y_model='y_mean')

Now, I’m trying to create an identical plot with plotly. I’m building an interactive application and would like to include this visualization. I’m not sure which data format the posterior_predictive samples should be in to plot them.

This is a fully reproducible example with Bambi, using Matplotlib.

Some highlights

  • you need to sort the values of the predictor so the lines and the bands are properly displayed
  • we’re not plotting all the samples from the posterior predictive distribution (arviz does the same)
  • you could reuse this to create your plotly stuff
import arviz as az
import bambi as bmb
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


rng = np.random.default_rng(1234)
b0, b1, sigma = 1.5, 0.6, 1.5
x = rng.uniform(2, 10, size=200)
y = rng.normal(b0 + b1 * x, sigma)

data = pd.DataFrame({"x": x, "y": y})

model = bmb.Model("y ~ 1 + x", data)
idata = model.fit()
model.predict(idata, kind="pps")

# Sort on plain NumPy arrays so the indexing is positional, not label-based
sort_idxs = np.argsort(data["x"].to_numpy())
x_sorted = data["x"].to_numpy()[sort_idxs]

y_mean = idata.posterior["y_mean"].mean(("chain", "draw")).to_numpy()
y_mean_bands = idata.posterior["y_mean"].quantile((0.025, 0.975), ("chain", "draw")).to_numpy()

# plot_lm also uses `num_samples=50`
y_pp = az.extract(idata.posterior_predictive, num_samples=50)["y"].to_numpy().T

fig, ax = plt.subplots()
for y_values in y_pp:
    ax.scatter(data["x"], y_values, color="C0", alpha=0.1)
ax.scatter(data["x"], data["y"], color="C1");

ax.fill_between(
    x_sorted, 
    y_mean_bands[0][sort_idxs],
    y_mean_bands[1][sort_idxs], 
    color="0.4", 
    alpha=0.7
)
ax.plot(x_sorted, y_mean[sort_idxs], color="black", lw=1.55);
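
For the Plotly part, here is a rough sketch of how the same pieces might be reshaped and drawn. It is not tested against your app, and the helper names `summarize_for_plot` and `make_figure` are made up for illustration. It assumes the arrays have shape (chain, draw, observation), which is what `idata.posterior["y_mean"].to_numpy()` and `idata.posterior_predictive["y"].to_numpy()` should give for the model above. The colors and the 95% band are arbitrary choices.

```python
import numpy as np


def summarize_for_plot(x, y_mean_draws, y_pp_draws, num_samples=50, seed=0):
    """Reduce posterior arrays to plain NumPy arrays sorted by x.

    Hypothetical helper. Assumes both draw arrays have shape
    (n_chains, n_draws, n_obs) and that x has shape (n_obs,).
    """
    x = np.asarray(x)
    order = np.argsort(x)  # sort so lines and bands draw left to right
    n_obs = x.shape[0]

    # Stack chains and draws into one leading "sample" axis
    mean_flat = np.asarray(y_mean_draws).reshape(-1, n_obs)
    pp_flat = np.asarray(y_pp_draws).reshape(-1, n_obs)

    # Keep only a random subset of posterior predictive draws
    rng = np.random.default_rng(seed)
    n_keep = min(num_samples, pp_flat.shape[0])
    keep = rng.choice(pp_flat.shape[0], size=n_keep, replace=False)

    lower, upper = np.quantile(mean_flat, [0.025, 0.975], axis=0)
    return {
        "x": x[order],
        "mean": mean_flat.mean(axis=0)[order],
        "lower": lower[order],
        "upper": upper[order],
        "pp": pp_flat[keep][:, order],  # shape (n_keep, n_obs)
    }


def make_figure(summary, x_obs, y_obs):
    """Build a Plotly figure from the summary dict (hypothetical helper)."""
    # Imported here so the data preparation above works without Plotly
    import plotly.graph_objects as go

    n_keep = summary["pp"].shape[0]
    fig = go.Figure()
    # All kept posterior predictive draws as one faint marker trace
    fig.add_trace(go.Scatter(
        x=np.tile(summary["x"], n_keep), y=summary["pp"].ravel(),
        mode="markers", marker=dict(color="steelblue", opacity=0.1),
        name="posterior predictive samples",
    ))
    fig.add_trace(go.Scatter(
        x=x_obs, y=y_obs, mode="markers",
        marker=dict(color="darkorange"), name="observed",
    ))
    # Band: invisible lower edge, then upper edge filled down to it
    fig.add_trace(go.Scatter(
        x=summary["x"], y=summary["lower"], mode="lines",
        line=dict(width=0), showlegend=False,
    ))
    fig.add_trace(go.Scatter(
        x=summary["x"], y=summary["upper"], mode="lines",
        line=dict(width=0), fill="tonexty",
        fillcolor="rgba(100, 100, 100, 0.5)", name="95% interval of the mean",
    ))
    fig.add_trace(go.Scatter(
        x=summary["x"], y=summary["mean"], mode="lines",
        line=dict(color="black"), name="posterior mean",
    ))
    return fig
```

With the objects from the example above, the call would look something like `summary = summarize_for_plot(data["x"], idata.posterior["y_mean"].to_numpy(), idata.posterior_predictive["y"].to_numpy())` followed by `make_figure(summary, data["x"], data["y"]).show()`. So the "format" is just plain arrays: one row per kept draw, one column per observation.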
