How is the mode point estimate calculated by the `plot_posterior` method?

mrzeliboba · June 24, 2020, 6:27am

When plotting the posterior from a trace, you can choose point_estimate='mode'. I also need the mode as a separate statistic, but unfortunately it is not among the default statistics of the stats.summary() method. So I pass a wrapper around scipy.stats.mode through the stat_funcs parameter (a wrapper is needed because scipy.stats.mode also returns a second value, the count, which has to be discarded). When I compare the value shown by plot_posterior(point_estimate='mode') with the value returned by scipy.stats.mode, the two are very different in my case: I get 7.339 from plot_posterior(), while scipy.stats.mode() returns 29.097, which is the more plausible value.
Now, the question is: how does plot_posterior calculate the mode, where can this discrepancy with the scipy implementation come from, and which value should I rely on?

mrzeliboba · June 24, 2020, 6:33am

To add: I have now checked, and scipy.stats.mode is making its calculation based on only 30 bins, which is far too low for the range of values my posterior spans. Still, it would be interesting to know how the mode is calculated inside PyMC3, how reliable it is, and how to calculate it with PyMC3's own tools rather than third-party libraries, which lead to such discrepancies.
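To illustrate how much a binned mode estimate can depend on the number of bins, here is a minimal sketch. It assumes NumPy only; `histogram_mode` is a hypothetical helper written for this example, and the lognormal sample is a made-up stand-in for a skewed, long-tailed posterior, not the actual trace from this thread.

```python
import numpy as np


def histogram_mode(samples, bins):
    """Return the centre of the most populated histogram bin."""
    counts, edges = np.histogram(samples, bins=bins)
    i = np.argmax(counts)
    return float(0.5 * (edges[i] + edges[i + 1]))


rng = np.random.default_rng(0)
# Hypothetical skewed sample with a long right tail (not real posterior draws).
samples = rng.lognormal(mean=2.0, sigma=1.0, size=20_000)

# With few bins, each bin is wide because the tail stretches the range,
# so the reported mode can only be as precise as one bin width.
coarse = histogram_mode(samples, bins=30)
fine = histogram_mode(samples, bins=2000)

print(f"30 bins:   {coarse:.2f}")
print(f"2000 bins: {fine:.2f}")
```

With a long tail, the bin width is driven by the extreme values, so a coarse histogram may place the mode far from where the density actually peaks.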

junpenglao · June 24, 2020, 6:35am

PyMC3 uses ArviZ, which for continuous values uses a KDE function to smooth out the histogram and takes the point of maximum density:

https://github.com/arviz-devs/arviz/blob/1d9297c0f06ceb65211e574338610f14f9d004c9/arviz/plots/plot_utils.py#L646-L648

density, lower, upper = _fast_kde(values, bw=bw)
x = np.linspace(lower, upper, len(density))
point_value = x[np.argmax(density)]

It might not work well if your posterior is multimodal, but otherwise we think it is quite good in most cases.
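As a rough illustration of the grid-and-argmax idea in the snippet above, here is a minimal sketch using a plain Gaussian KDE written with NumPy. It is a stand-in, not ArviZ's `_fast_kde`: the function name `kde_mode`, the Scott-style bandwidth rule, and the synthetic samples are all assumptions made for this example.

```python
import numpy as np


def kde_mode(samples, grid_size=512, bw_scale=1.0):
    """Estimate the mode as the grid point where a Gaussian KDE peaks."""
    samples = np.asarray(samples, dtype=float).ravel()
    n = samples.size
    # Scott-style rule-of-thumb bandwidth (an assumption; ArviZ may differ).
    bw = bw_scale * samples.std(ddof=1) * n ** (-1 / 5)
    x = np.linspace(samples.min(), samples.max(), grid_size)
    # Sum of Gaussian kernels centred on each sample, evaluated on the grid.
    z = (x[:, None] - samples[None, :]) / bw
    density = np.exp(-0.5 * z**2).sum(axis=1)
    # Same final step as the quoted snippet: location of the density maximum.
    return float(x[np.argmax(density)])


rng = np.random.default_rng(1)

# Unimodal case: the estimate should land near the true peak at 3.
unimodal = rng.normal(3.0, 1.0, size=4000)
unimodal_mode = kde_mode(unimodal)

# Bimodal case: only the taller peak (near 0) is reported; the second
# peak near 8 is silently ignored, which is the multimodality caveat.
bimodal = np.concatenate([rng.normal(0.0, 1.0, 2800), rng.normal(8.0, 1.0, 1200)])
bimodal_mode = kde_mode(bimodal)

print(f"unimodal mode estimate: {unimodal_mode:.2f}")
print(f"bimodal mode estimate:  {bimodal_mode:.2f}")
```

Because a single argmax returns one value, a multimodal posterior is summarised by whichever peak happens to be tallest after smoothing, and that can change with the bandwidth.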

mrzeliboba · June 24, 2020, 7:20am

Thank you, Junpeng, for such a fast response. Now it has become even weirder, because calling az.plots.plot_utils.calculate_point_estimate('mode', trace['tau'], bw=4.5) results in array([29.09679086]), which is very close to the value returned by scipy.stats.mode. Yet plot_posterior shows 7.339. The only parameter that could have an effect (the bw parameter) does not really have one. So I am now wondering why plot_posterior shows such a value for the mode.

mrzeliboba · June 24, 2020, 8:54am

More info: when the mode is calculated by passing the stat_funcs parameter to az.stats.summary(), the result is the same value of 7.339, so something happens to the trace along the way that does not happen when calling az.plots.plot_utils.calculate_point_estimate('mode', trace['tau'], bw=4.5) directly.

junpenglao · June 24, 2020, 8:59am

Hmm, sounds like a bug. Could you raise an issue on ArviZ?

mrzeliboba · June 26, 2020, 6:38am

Thank you, I will!

marissafichera · August 23, 2023, 5:36pm

Has this been fixed? I’m running into a similar issue
