I have a fairly complex regression model which I can successfully obtain a posterior for. I need to extract a single point estimate for the coefficients from the posterior for use in my wider workflow. I realise that I should use the full posterior, run it through the entire workflow and only make a decision at the end, however my setup doesn’t allow for this (at the moment).
Are there any suggested methods for extracting a single set of coefficients? Here it simply says
In summary, while PyMC3 provides the function find_MAP()
, at this point mostly for historical reasons, this function is of little use in most scenarios. If you want a point estimate you should get it from the posterior. In the next section we will see how to get a posterior using sampling methods.
I can think of a few methods:
1/ Use find_MAP()
2/ From the posterior use the point with the highest log-likelihood
3/ Use the mean of posterior
4/ Use the median of posterior
5/ Have some sort of distance metric and select the point that is closest to all others
Outputs from 2 and 5 would be an actual sample from the posterior. Whereas 3 & 4 would not be.
A few notes on the model:
- It is hierarchical
- Most the RVs are Normal however a few have mean!=median.
- There are ~20 RVs
- Due to time constraints I only run a small number of draws, the ESS is typically ~200.