Plot of mean and HDI is really off

Hi All,

I was practicing PyMC3 with the spline regression from Statistical Rethinking, chapter 4.
The plot of the mean of a parameter together with its HDI looks very strange.

Here is my code:

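import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm

# d2 (the data) and trace4_7 (the sampled trace) come from earlier in the notebook
mu_p = trace4_7["mu"]        # posterior draws of mu, one row per draw
mu_bar = mu_p.mean(axis=0)   # posterior mean of mu for each year

_, ax = plt.subplots(1, 1, figsize=(12, 4))
ax.plot(d2.year, d2.temp, ".", alpha=0.5)
ax.plot(d2.year, mu_bar)
# overlay 100 randomly chosen posterior draws of mu
for i in np.random.randint(0, len(mu_p), 100):
    ax.plot(d2.year, mu_p[i], color="k", alpha=0.1)
pm.plot_hdi(d2.year, y=mu_p, ax=ax)
plt.show()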

I am not able to find the problem, and the result looks quite strange. The black curves are randomly sampled posterior draws, and the green shaded band is drawn by the plot_hdi function.

After changing the plotting call to

ax.fill_between(d2.year,az.hdi(mu_p)[:,0],az.hdi(mu_p)[:,1],color="red",alpha=0.5)

the graph looks OK and the band closely follows the mean.

I am using PyMC3 version 3.9.

Is this a bug or something?

Hi @saurpan!
You can find the PyMC3 port of this chapter here – I guess it’ll be useful!

Hi @AlexAndorra, thanks for the link. The issue is that the code in that notebook is for different variables, and its plot only uses plot_hdi.
The puzzling part is that the plot I get from pm.plot_hdi() is different from the one I get by calculating the HDI with az.hdi() and plotting it with plt.fill_between().
Logically, these two should give me the same plot.

First of all, I’d recommend updating to the latest ArviZ and PyMC3 if possible. If that is not possible, please share your ArviZ version too.

I think that the issue here is with the conversion (or probably the lack of conversion) of your trace to InferenceData. I’d recommend you first convert your trace to InferenceData with az.from_pymc3 (this blog post of mine might help too, in addition to the docs). Then use InferenceData or xarray objects to call hdi and plot_hdi.
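
For reference, here is a minimal sketch of that workflow. It assumes an ArviZ 0.x release (0.9 or later) in which plot_hdi accepts the hdi_data argument. With a real model you would obtain the InferenceData from az.from_pymc3(trace, model=model); here simulated draws and az.from_dict stand in for the trace so the snippet runs on its own. The array mu, the "year" dimension name and all numbers are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
import arviz as az

rng = np.random.default_rng(0)
n_chains, n_draws = 2, 500
years = np.arange(1900, 1950)

# Hypothetical stand-in for the posterior of mu: noisy draws around a trend,
# laid out as (chain, draw, year), which is the layout InferenceData expects.
trend = 6.0 + 0.02 * (years - 1900)
mu = trend + rng.normal(0.0, 0.1, size=(n_chains, n_draws, years.size))

# With PyMC3 this step would be: idata = az.from_pymc3(trace, model=model)
idata = az.from_dict(
    posterior={"mu": mu},
    coords={"year": years},
    dims={"mu": ["year"]},
)

# HDI computed on labelled data: chain and draw are reduced, year is kept.
mu_hdi = az.hdi(idata, var_names=["mu"], hdi_prob=0.94)
mu_bar = idata.posterior["mu"].mean(dim=("chain", "draw")).values

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(years, mu_bar)
# Pass the precomputed interval so plot_hdi does not have to guess the layout.
az.plot_hdi(years, hdi_data=mu_hdi["mu"], ax=ax)
# Call plt.show() or fig.savefig(...) to view the figure.
```

Because the interval is computed once on an object with named dimensions, the band drawn by plot_hdi and a band drawn manually with fill_between come from the same numbers.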

My guess with the info available is that hdi is converting the trace to InferenceData internally and the automatic conversion works fine, so that plot looks fine, whereas plot_hdi is interpreting the trace as a plain array with a different axis layout and giving nonsensical results.
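
To see why the axis layout matters in general, here is a NumPy-only sketch. It is not a diagnosis of what plot_hdi does internally, and simple_hdi is a hypothetical helper written for this illustration, not ArviZ's implementation. It compares an interval computed per x value (reducing over draws) with one computed after the axes are misread and everything is pooled.

```python
import numpy as np

def simple_hdi(samples, prob=0.94):
    """Narrowest interval holding `prob` of 1-D samples (illustrative only)."""
    s = np.sort(np.asarray(samples))
    n = len(s)
    k = int(np.floor(prob * n))   # index offset between the interval's ends
    widths = s[k:] - s[: n - k]   # width of every candidate interval
    i = int(np.argmin(widths))    # the narrowest candidate wins
    return s[i], s[i + k]

rng = np.random.default_rng(1)
n_draws, n_points = 1000, 40
x = np.linspace(0, 1, n_points)
# Made-up draws of a mean curve, laid out as (draw, point).
mu = 3 * x + rng.normal(0.0, 0.1, size=(n_draws, n_points))

# Intended reading: one interval per x value, reducing over the draw axis.
per_point = np.array([simple_hdi(mu[:, j]) for j in range(n_points)])

# Misread layout: pooling every value gives one wide interval that ignores x.
pooled = simple_hdi(mu.ravel())

print(per_point.shape)  # (40, 2)
print("mean per-point width:", (per_point[:, 1] - per_point[:, 0]).mean())
print("pooled width:", pooled[1] - pooled[0])
```

The per-point band is narrow and follows the curve, while the pooled interval spans almost the whole range of the data, which is the kind of mismatch that shows up when an array's axes are interpreted differently from what was intended.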

Side note: keep in mind that the black curves and the HDI you are plotting follow the distribution of mu, which I am guessing is the distribution of the mean of your observations. They do not follow the distribution of the observations themselves.
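
A small NumPy sketch of that difference, using made-up numbers at a single x value. The names mu_draws, sigma_draws and y_draws are hypothetical, and the intervals are plain equal-tailed percentile intervals rather than HDIs, to keep the snippet short.

```python
import numpy as np

rng = np.random.default_rng(2)
n_draws = 2000

# Hypothetical posterior draws of the mean and of the observation noise.
mu_draws = rng.normal(8.0, 0.05, size=n_draws)
sigma_draws = np.abs(rng.normal(0.3, 0.02, size=n_draws))

# Simulated observations: one draw of y for each (mu, sigma) pair.
y_draws = rng.normal(mu_draws, sigma_draws)

# 94% equal-tailed intervals (percentiles, not HDIs).
mu_lo, mu_hi = np.percentile(mu_draws, [3, 97])
y_lo, y_hi = np.percentile(y_draws, [3, 97])

print("interval width for mu:", mu_hi - mu_lo)
print("interval width for y: ", y_hi - y_lo)
```

The interval for mu only reflects uncertainty about the mean, so it is much narrower than the interval for simulated observations, which also includes the observation noise. A band computed from mu is therefore not expected to cover most of the data points.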


Thanks @OriolAbril. I will follow the suggestions and post back here.
Yes, you are right: I am plotting the mean of my observations.