Getting point estimate from posterior

Hello

I have a fairly complex regression model which I can successfully obtain a posterior for. I need to extract a single point estimate for the coefficients from the posterior for use in my wider workflow. I realise that I should use the full posterior, run it through the entire workflow and only make a decision at the end, however my setup doesn’t allow for this (at the moment).

Are there any suggested methods for extracting a single set of coefficients? Here it simply says

In summary, while PyMC3 provides the function find_MAP() , at this point mostly for historical reasons, this function is of little use in most scenarios. If you want a point estimate you should get it from the posterior. In the next section we will see how to get a posterior using sampling methods.

I can think of a few methods:
1/ Use find_MAP()
2/ From the posterior use the point with the highest log-likelihood
3/ Use the mean of posterior
4/ Use the median of posterior
5/ Have some sort of distance metric and select the point that is closest to all others

Outputs from 2 and 5 would be an actual sample from the posterior. Whereas 3 & 4 would not be.

A few notes on the model:

  • It is hierarchical
  • Most the RVs are Normal however a few have mean!=median.
  • There are ~20 RVs
  • Due to time constraints I only run a small number of draws, the ESS is typically ~200.

Any thoughts?

As you suggest, the most “Bayesian” answer is to not do this and the documentation regarding the MAP reflects this. But if you must, then you need to select an estimator and do so considering how collapsing your full posterior into a single point estimate will influence downstream inferences. Your tolerance for different kinds of consequences should shape what estimator you ultimately select. The mean, median, and mode, for example all reflect different loss functions. Keep in mind that some of these estimators may seem more reasonable assuming “well-behaved” distributions (e.g., symmetric, unimodal, etc.) but that posteriors aren’t always so cooperative.

We had some related discussion here.

3 Likes

I completely agree with cluhmann’s answer. I would just add that the posterior mode for continuous variables doesn’t really correspond to any choice of loss function (see passage after “When the loss function is of the form…” here Maximum a posteriori estimation - Wikipedia). In addition, find_MAP can have issues with hierarchical models. So I would suggest steering away from that. To choose from the others, you can take a look at cluhmann’s resources. The posterior mean, for example, minimises the mean square error loss and can be reasonable choice in many cases.

3 Likes