[Edit: This isn’t quite what’s going on in all these systems like NONMEM, TMB, and lme4. See clarifying posts below.]
This can be almost arbitrarily bad. What works better for a hierarchical model is a max marginal “Laplace” approximation. That is, marginalize out the low-level parameters, optimize the high-level parameters, plug those estimates back in to optimize the low-level parameters, then lay down a second-order Taylor series approximation at that point. (It’s not quite a Laplace approximation: the point is not the joint mode, so technically the first-order term will not drop out, but we’ll pretend it does here.)
That is, if we have posterior p(\alpha, \beta), derive the marginal
\qquad p(\alpha) = \int_B p(\alpha, \beta) \textrm{d}\beta,
where B is the range of \beta. Then optimize that to estimate \alpha,
\qquad \alpha^* = \textrm{argmax}_\alpha \ p(\alpha).
Now plug that point estimate back in to get
\qquad p(\beta \mid \alpha^*) \propto p(\alpha^*, \beta),
and optimize again to get
\qquad \beta^* = \textrm{argmax}_\beta \ p(\beta \mid \alpha^*).
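The two optimization steps can be sketched on a toy problem. Everything below is an assumption made for illustration, not something from the systems mentioned above: the model (y_j ~ Normal(beta_j, 1), beta_j ~ Normal(0, exp(alpha)), flat prior on alpha), the data, and the helper names `log_marginal` and `golden_max`. The model is chosen because the marginal over \beta is available in closed form.

```python
import math

# Hypothetical toy model (an assumption for illustration):
#   y_j ~ Normal(beta_j, 1),  beta_j ~ Normal(0, tau),  tau = exp(alpha),
# with a flat prior on alpha. Here alpha is the high-level parameter.
y = [2.1, -1.3, 0.4, 3.2, -2.7, 1.8]

def log_marginal(alpha):
    """log p(alpha) up to a constant: each beta_j integrates out analytically,
    leaving y_j ~ Normal(0, sqrt(1 + tau^2))."""
    var = 1.0 + math.exp(2.0 * alpha)
    return sum(-0.5 * math.log(var) - 0.5 * yj**2 / var for yj in y)

def golden_max(f, lo, hi, tol=1e-8):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return 0.5 * (a + b)

# Step 1: optimize the marginal over the high-level parameter.
alpha_star = golden_max(log_marginal, -5.0, 5.0)
tau2 = math.exp(2.0 * alpha_star)

# Step 2: plug alpha_star in and optimize over beta. For this model
# p(beta_j | alpha_star) is normal, so its mode is a shrunken y_j.
shrink = tau2 / (1.0 + tau2)
beta_star = [shrink * yj for yj in y]

print(f"alpha* = {alpha_star:.4f}, tau^2 = {tau2:.4f}")
print("beta*  =", [round(b, 3) for b in beta_star])
```

In a real hierarchical model the marginal rarely has a closed form, so step 1 would itself need an approximation to the integral; the closed form here only keeps the sketch short.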
Now if you take the negative Hessian at (\alpha^*, \beta^*), that gives you a precision matrix
\qquad \Sigma^{-1} = -H(\alpha^*, \beta^*) = -\nabla_{\alpha, \beta} \nabla_{\alpha, \beta}^{\top} \log p(\alpha, \beta) \, \big|_{(\alpha, \beta) = (\alpha^*, \beta^*)},
that you can plug into a multivariate normal, e.g.,
\qquad (\alpha, \beta) \sim \textrm{multiNormal}((\alpha^*, \beta^*), \Sigma).
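Here is a minimal sketch of the Hessian step, assuming NumPy and a hypothetical toy model (y_j ~ Normal(beta_j, 1), beta_j ~ Normal(0, exp(alpha)), flat prior on alpha) for which both optimization steps have closed forms. The names `log_joint`, `precision`, and `fd_hessian` are made up for the illustration. The analytic negative Hessian is compared against finite differences, and its nonzero pattern is counted.

```python
import numpy as np

# Hypothetical toy model (an assumption for illustration):
#   y_j ~ Normal(beta_j, 1),  beta_j ~ Normal(0, exp(alpha)),  flat prior on alpha.
y = np.array([2.1, -1.3, 0.4, 3.2, -2.7, 1.8])
n = y.size

# Two-stage point estimate; both steps have closed forms for this model.
tau2 = np.mean(y**2) - 1.0            # maximizer of the marginal, tau = exp(alpha)
alpha_star = 0.5 * np.log(tau2)
beta_star = y * tau2 / (1.0 + tau2)   # maximizer of p(beta | alpha_star)
theta_star = np.concatenate(([alpha_star], beta_star))

def log_joint(theta):
    """log p(alpha, beta) up to a constant, with theta = (alpha, beta_1..beta_n)."""
    alpha, beta = theta[0], theta[1:]
    w = np.exp(-2.0 * alpha)          # 1 / tau^2
    return np.sum(-0.5 * (y - beta)**2 - alpha - 0.5 * w * beta**2)

def precision(theta):
    """Analytic negative Hessian of log_joint, in the same ordering as theta."""
    alpha, beta = theta[0], theta[1:]
    w = np.exp(-2.0 * alpha)
    P = np.zeros((n + 1, n + 1))
    P[0, 0] = 2.0 * w * np.sum(beta**2)        # -d^2 / d alpha^2
    P[0, 1:] = P[1:, 0] = -2.0 * w * beta      # -d^2 / (d alpha d beta_j)
    P[1:, 1:] = np.diag(np.full(n, 1.0 + w))   # beta_j's do not interact directly
    return P

def fd_hessian(f, x, h=1e-4):
    """Central finite-difference Hessian, used only as a sanity check."""
    d = x.size
    E = h * np.eye(d)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            H[i, j] = (f(x + E[i] + E[j]) - f(x + E[i] - E[j])
                       - f(x - E[i] + E[j]) + f(x - E[i] - E[j])) / (4.0 * h * h)
    return H

P = precision(theta_star)
P_fd = -fd_hessian(log_joint, theta_star)
Sigma = np.linalg.inv(P)   # dense inverse, formed here only to compare patterns

print("max |analytic - finite difference| =", np.max(np.abs(P - P_fd)))
print("nonzeros in precision :", np.count_nonzero(P), "of", P.size)
print("nonzeros in covariance:", np.count_nonzero(Sigma), "of", Sigma.size)
```

Because (\alpha^*, \beta^*) is not the joint mode, the negative Hessian is not guaranteed to be positive definite in general. It happens to be for this data, where the estimated prior variance exceeds one.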
The really cool thing here is that this Hessian is often sparse, so you can evaluate the approximate density and sample from the Laplace approximation very efficiently with only sparse precision-matrix/vector operations. If you try to invert the precision to get the covariance, which is the inverse mass matrix you’d need for Hamiltonian Monte Carlo, it becomes dense again. So instead you use it as a preconditioner on the density, where it never needs to leave sparse form.
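As a rough sketch of working in precision form, the snippet below builds an arrowhead precision matrix for a hypothetical toy model (y_j ~ Normal(beta_j, 1), beta_j ~ Normal(0, exp(alpha)), flat prior on alpha), then evaluates the normal log density and draws samples without ever forming \Sigma. Two assumptions to flag: the parameters are ordered (\beta, \alpha), with \alpha last, so that the Cholesky factor has the same sparsity pattern as the precision, and NumPy's dense `np.linalg.cholesky` stands in for the sparse Cholesky routine (for example CHOLMOD) that a real implementation would use. The map `to_theta` is one reading of the preconditioning idea, not necessarily how any particular package does it.

```python
import numpy as np

# Hypothetical toy model (an assumption for illustration):
#   y_j ~ Normal(beta_j, 1),  beta_j ~ Normal(0, exp(alpha)),  flat prior on alpha.
y = np.array([2.1, -1.3, 0.4, 3.2, -2.7, 1.8])
n = y.size

# Closed-form two-stage point estimate for this model.
tau2 = np.mean(y**2) - 1.0
w = 1.0 / tau2
alpha_star = 0.5 * np.log(tau2)
beta_star = y * tau2 / (1.0 + tau2)
mu = np.append(beta_star, alpha_star)   # ordering: (beta_1..beta_n, alpha)

# Arrowhead precision (negative Hessian of the log joint at mu).
P = np.diag(np.append(np.full(n, 1.0 + w), 2.0 * w * np.sum(beta_star**2)))
P[n, :n] = P[:n, n] = -2.0 * w * beta_star

# P = L @ L.T; raises LinAlgError if P is not positive definite.
# With alpha last, L is nonzero only on its diagonal and its last row.
L = np.linalg.cholesky(P)

def log_density(theta):
    """Normal(mu, P^{-1}) log density using only P and L, never the covariance."""
    r = theta - mu
    half_log_det = np.sum(np.log(np.diag(L)))   # 0.5 * log det P
    return half_log_det - 0.5 * r @ (P @ r) - 0.5 * (n + 1) * np.log(2.0 * np.pi)

def to_theta(z):
    """Map standard-normal z to theta = mu + L^{-T} z, so theta ~ Normal(mu, P^{-1})."""
    return mu + np.linalg.solve(L.T, z)

rng = np.random.default_rng(0)
draws = np.array([to_theta(rng.standard_normal(n + 1)) for _ in range(20000)])

print("log density at the point estimate:", log_density(mu))
print("sample sd of alpha:", draws[:, n].std())
```

Running a sampler on z and mapping through `to_theta` touches only the factor of the precision; the dense covariance appears nowhere in the computation.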
If you just take the point estimate and the Hessian-based approximation, you are basically computing the same point estimate and uncertainty as lme4 or TMB do in R.
Here’s a paper showing how to do all this and use it as an efficient full-rank preconditioner for Hamiltonian Monte Carlo sampling. Sorry for the self-plug, but this is really all Cole’s work.