Summarize inference data (HDI)

Hi!

When I summarize the statistics of my inference data using az.summary(), one of the columns of the dataframe is hdi_3%. A highest density interval is supposed to be a range of two values, but in the summarized data each entry of this column is one specific value. What does this value mean?

I am sorry if this is a very basic question.

Thank you in advance!

When requesting the 94% HDI (ArviZ's current default), there should be 2 columns, one holding the lower bound of the interval and one holding the upper bound. So hdi_3% is the former and there should be a hdi_97% column with the latter. The 3% and 97% in the labels come from splitting the remaining 6% evenly between the two sides. Strictly speaking, the bounds are the ends of the narrowest interval containing 94% of the posterior, so they coincide with the 3% and 97% quantiles only when the posterior is roughly symmetric.

Just to add to @cluhmann's answer, I found the following command very handy:

az.hdi(trace, var_names=["XXX"], hdi_prob=0.80).values()

It will print out the HDI of whichever variables you want from your trace, using the specified hdi_prob value.
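
Here is a small self-contained sketch of both calls. It assumes a pre-1.0 ArviZ release (the one that produces the hdi_3% / hdi_97% column names discussed above), and the variable name "mu" and the fake normal draws are made up purely for illustration:

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior: 4 chains x 1000 draws of a variable called "mu".
idata = az.from_dict(posterior={"mu": rng.normal(0.0, 1.0, size=(4, 1000))})

# The summary table holds the two HDI bounds in two separate columns.
summary = az.summary(idata, hdi_prob=0.94)
print(summary[["hdi_3%", "hdi_97%"]])

# az.hdi returns an xarray Dataset; each variable holds [lower, upper].
hdi_80 = az.hdi(idata, var_names=["mu"], hdi_prob=0.80)
lower, upper = hdi_80["mu"].values
print(lower, upper)
```

With a different hdi_prob the summary columns are renamed accordingly, so it is safer to look them up by inspecting summary.columns than to hard-code the names.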

Hi!

I am sorry (again) if this is a very basic question, but I am really confused. I understand that the HDI is the narrowest credible interval of a distribution that contains a specified probability mass. How do we calculate the HDI from samples of a distribution?

In addition, I cannot understand the difference between an HDI and a confidence interval.

The arviz code used to compute the HDI is here.
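
For one-dimensional, unimodal samples the basic idea can be sketched in a few lines of NumPy: sort the draws, slide a window that covers the requested fraction of them, and keep the narrowest window. This is a simplified illustration rather than ArviZ's actual implementation; the function name hdi_from_samples and the gamma-distributed test draws are made up for the example.

```python
import numpy as np


def hdi_from_samples(samples, hdi_prob=0.94):
    """Narrowest interval covering about hdi_prob of the samples (unimodal case)."""
    x = np.sort(np.asarray(samples, dtype=float).ravel())
    n = x.size
    # Each candidate interval runs from x[i] to x[i + n_in].
    n_in = int(np.floor(hdi_prob * n))
    if n_in < 1 or n_in >= n:
        raise ValueError("need more samples or a hdi_prob strictly between 0 and 1")
    # Width of every candidate interval, computed in one vectorized step.
    widths = x[n_in:] - x[: n - n_in]
    i = int(np.argmin(widths))
    return x[i], x[i + n_in]


rng = np.random.default_rng(0)
draws = rng.gamma(shape=2.0, scale=1.0, size=20_000)  # right-skewed samples

lo, hi = hdi_from_samples(draws, hdi_prob=0.94)
q_lo, q_hi = np.quantile(draws, [0.03, 0.97])  # equal-tailed interval

print("HDI:         ", lo, hi)
print("equal-tailed:", q_lo, q_hi)
```

For a skewed distribution like this one, the HDI is shifted toward the mode and is narrower than the interval between the 3% and 97% quantiles; for a symmetric distribution the two nearly coincide.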

The difference between an HDI and a confidence interval is a much longer discussion. I recommend the Wikipedia articles here and here.
