Using find_MAP and hessian to find frequentist standard errors, t-stats and p-values

phildias · July 24, 2018, 3:45pm

Hello,

I’ve run pm.find_MAP and gotten the maximum likelihood estimates from the pymc model I’ve originally set up.

However, my model is setup such that some of the parameters to be estimated are packed/stacked into arrays. Also, since I work with the cholesky decomposition of a covariance matrix, I also create a couple of extra deterministic variables that make it easier for me to see the actual covariances, variances and correlations. So all of these extra elements show up in my MAP estimate.

Just to illustrate my example, this is what the output from find_MAP looks like:

{'Betas_Y1': array([1.04077349, 0.99860613, 1.98801345, 3.00113906, 3.96025654]),
 'Betas_Y2': array([ 1.07404858, -0.89988903, -1.96119784, -2.98798061, -3.99543953, 3.00343968]),
 'chol_cov_cholesky-cov-packed__': array([ 1.08543907, -3.19124886,  0.67209238]),
 'chol_cov': array([ 2.9607395 , -3.19124886,  1.95833061]),
 'cov_terms': array([ 8.76597837, -9.44845656, 14.01912809]),
 'sd': array([2.9607395 , 3.74421261]),
 'cor_terms': array([-0.85231508]),
 'mu': array([[  4.54342693,  14.53801376],
              [ 18.82521416,  -5.8585966 ],
              [ 11.95251942,  16.09560137],
              ...,
              [-23.4664827 ,   5.86015983],
              [ 18.02911431,  -8.4842003 ],
              [ -4.64908722,  -2.54813962]])}

Finally, my main question: how should I run the pm.find_hessian command do estimate the standard errors, t-stats and p-values for my model? Should I just feed the output from find_MAP into find_hessian? If so, am I sure the order is being maintained? And after I get the results, what’s the order of the parameters? How can I relate each output of find_MAP to a row/column of the hessian?

Also I’ve noticed that the hessian is only calculated with respect to non-deterministic variables that don’t have observed data associated with them. Is there any way to use the find_hessian function to find standard errors and t-stats for some of my deterministic variables, such as the correlation terms and variances? This is because the standard errors of the elements in the cholesky decomposition aren’t that important to me - what really matters are the standard errors of the elements in the actual covariance matrix.

Thank you!

junpenglao · July 24, 2018, 4:53pm

There is a mapping function to map the array to a dict (point for pymc3 model) and reverse mapping the array to a dict, eg see

github.com

pymc-devs/pymc3/blob/aaaa00fe6a17f1687cff819f7a42b0edb24db70b/pymc3/tuning/starting.py#L71-L74


start = Point(start, model=model)
bij = DictToArrayBijection(ArrayOrdering(vars), start)
logp_func = bij.mapf(model.fastlogp_nojac)
x0 = bij.map(start)

And I will very strongly against computing the standard errors, t-stats and p-values for your model, why look at one point when you have the whole sky (ie the whole posterior space)??

tymwol · March 31, 2020, 3:29pm

Could you expand the answer @junpenglao a little bit? How is this used when using pm.find_hessian ?

Topic		Replies	Views
Best practices for MAP estimation Questions	3	1223	September 6, 2019
Unconsistent logp, Evaluating the results of find_MAP on Custom Likelihood Questions	3	581	August 28, 2018
find_MAP struggles with hierarchical model version agnostic	3	45	February 19, 2025
Maximum likelihood estimation of a Bayesian model Questions	11	5444	April 12, 2018
Evaluating log likelihood using estimates from find_MAP results Questions	3	1336	January 25, 2020

Using find_MAP and hessian to find frequentist standard errors, t-stats and p-values

Related topics