How do I set the arviz axis to something readable when comparing the prior predictive?

Hello,

I’m trying to compare my prior predictive vs. actuals using this tutorial:
Introductory Overview of PyMC — PyMC 5.0.2 documentation

az.plot_dist(
    df1["residual"].values,
    kind="hist",
    color="C1",
    # hist_kwargs=dict(alpha=0.6),
    label="observed"
    # ax = ax
)

az.plot_dist(
    idata.prior_predictive["predicted_eaches"],
    kind="hist",
    # hist_kwargs=dict(alpha=0.6),
    label="simulated",
)
plt.xticks(rotation=45);

This results in the following:

How do I get ArviZ to stretch the x-axis out so that it’s readable? I know from the above that these are bad priors, but I’d like to start with something readable so I can demonstrate how changing parameters in the RVs or adding variables changes this.

ArviZ plot functions return a matplotlib axes object. You should be able to manipulate various elements of the plot (e.g., set_xlim()) from there. Does that help?

Thank you. I tried the following but got the same result.

az.plot_dist(
    df1["residual"].values,
    kind="hist",
    color="C1",
    # hist_kwargs=dict(alpha=0.6),
    label="observed"
    # ax = ax
)
ax.set_xlim(df1["residual"].values.min(), idata.prior_predictive["predicted_eaches"].max())
az.plot_dist(
    idata.prior_predictive["predicted_eaches"],
    kind="hist",
    # hist_kwargs=dict(alpha=0.6),
    label="simulated",
)
ax.set_xlim(df1["residual"].values.min(), idata.prior_predictive["predicted_eaches"].max())
plt.xticks(rotation=45);

Are things still squashed if you plot the 2 distributions separately? I’m not entirely sure how those 2 are ending up on the same axis given the 2 calls you pasted here (though I’m not sure I have tried doing exactly what you have).

They are not squashed when plotted separately.

@OriolAbril would be able to answer definitively, but I might suggest creating a single axis by hand (e.g., plt.subplots() or plt.figure().gca(), etc.) and then passing that single axis into each call to plot_dist (i.e., az.plot_dist(..., ax=my_axis)). I suspect the issue is caused by ArviZ automatically trying to create an axis for each plot. Then you should be able to tweak the axis as you wish. Not 100% on that, but it seems reasonable.
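A minimal sketch of that suggestion, assuming the matplotlib backend of an ArviZ 0.x release (where az.plot_dist accepts an ax argument). The arrays observed and simulated are synthetic stand-ins for df1["residual"].values and idata.prior_predictive["predicted_eaches"]. Resetting the tick locator at the end is my own addition, not something confirmed in this thread:

```python
import arviz as az
import matplotlib

matplotlib.use("Agg")  # headless backend so this runs as a script; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MaxNLocator

rng = np.random.default_rng(0)
# Synthetic stand-ins (assumptions) for the observed residuals and the
# prior predictive draws discussed in this thread.
observed = rng.normal(0.0, 1.0, size=500)
simulated = rng.normal(0.0, 50.0, size=5000)

# Create one axis by hand and pass it to both calls.
fig, ax = plt.subplots(figsize=(8, 4))
az.plot_dist(observed, kind="hist", color="C1", label="observed", ax=ax)
az.plot_dist(simulated, kind="hist", label="simulated", ax=ax)

# Replace whatever ticks the histogram calls set with a few evenly spaced ones.
ax.xaxis.set_major_locator(MaxNLocator(nbins=8))
ax.tick_params(axis="x", rotation=45)
ax.legend()
```

Because both histograms live on the same hand-made axis, any later tweak (limits, ticks, rotation) applies to the combined plot.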

Thanks. I tried the below with the same result. I’ll try some other things to see if I can find a workaround.

I figured this out…just use the observed values from the trace object.

I’m still curious what arviz is doing under the hood to cause this. Any clue @OriolAbril ?

The main convention for histograms is to label the “mark” (generally the midpoint, otherwise some other useful or relevant id) of each bin, and to do so explicitly. When done independently, both plots look more or less normal, but when overlaying them such labeling becomes problematic unless the histograms share the same bins.

Because what is labeled isn’t really the axis but the histogram bins, labeling with the histogram that has the greater range gives visible labels that span most of the axis. That will generally look good, but you might not be able to actually interpret the narrower histogram. Conversely, labeling with the narrower histogram puts labels only on the section of the plot covered by the narrower range, which generally means far too many labels in too little horizontal space.
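To illustrate the shared-bins point, here is a rough sketch in plain matplotlib rather than ArviZ, so it is not what ArviZ does internally. The data is synthetic, and the choices of 30 bins and labeling every third midpoint are arbitrary assumptions:

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so this runs as a script; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-ins (assumptions): one narrow sample, one wide sample.
observed = rng.normal(0.0, 1.0, size=500)
simulated = rng.normal(0.0, 10.0, size=5000)

# One set of bin edges covering both samples, so the bin labels agree.
edges = np.histogram_bin_edges(np.concatenate([observed, simulated]), bins=30)
centers = (edges[:-1] + edges[1:]) / 2

fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(simulated, bins=edges, density=True, alpha=0.6, label="simulated")
ax.hist(observed, bins=edges, density=True, alpha=0.6, color="C1", label="observed")

# Label every third bin midpoint so the labels stay legible.
ax.set_xticks(centers[::3])
ax.tick_params(axis="x", rotation=45)
ax.legend()
```

With shared bins the labels line up for both histograms, although the narrower sample ends up concentrated in only a few bins, which is the trade-off described above.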