Computational complexity of NUTS algorithm

I want to compare the overall computational complexity of sampling directly from a high-dimensional multivariate normal (MVN) distribution with that of sampling via an MCMC framework using the NUTS sampler.

When drawing samples from an MVN distribution directly, the computational bottleneck is the Cholesky decomposition, which costs O(n^3). For spatial GMRFs with a sparse precision matrix, this can be reduced to O(n^(3/2)) (Rue and Held, 2005).
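To make the comparison concrete, here is a minimal sketch of direct MVN sampling with NumPy (the matrix `Sigma` and size `n` are made-up illustrations, not from the question): the one-time Cholesky factorization is the O(n^3) step, after which each draw costs only O(n^2).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small example: sample from N(mu, Sigma) directly.
n = 4
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)  # symmetric positive definite covariance
mu = np.zeros(n)

# The O(n^3) bottleneck: factor Sigma = L L^T once.
L = np.linalg.cholesky(Sigma)

# Each subsequent draw is only O(n^2): x = mu + L z with z ~ N(0, I).
z = rng.standard_normal(n)
x = mu + L @ z
```

With a sparse precision matrix one would instead factor the precision (e.g. with a fill-reducing ordering) and solve a triangular system per draw, which is where the O(n^(3/2)) figure for spatial GMRFs comes from.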

According to a notebook on the CAR implementation in pymc3, evaluating the log density in the last example with a sparse precision matrix can be done in linear time. However, within the NUTS framework this is only one step. So I am wondering: what is the overall computational complexity (number of flops), including the proposal of samples with NUTS and the accept/reject step?
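For reference, the linear-time claim for the log density can be illustrated with a sketch like the following (the tridiagonal precision `Q` is a made-up stand-in for a CAR/GMRF precision, not the notebook's model): the quadratic form touches each nonzero of Q exactly once, so its cost is O(nnz(Q)) = O(n) for a sparse precision, and NUTS evaluates this (and its gradient) once per leapfrog step.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(1)

# Hypothetical sparse precision: tridiagonal, like a 1-D random-walk GMRF.
n = 1000
main = 2.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
Q = sp.diags([off, main, off], offsets=[-1, 0, 1], format="csr")
Q = Q + 0.1 * sp.eye(n)  # shift to ensure strict positive definiteness

x = rng.standard_normal(n)

# Quadratic form x^T Q x: one pass over the nonzeros of Q, hence O(n)
# for a banded precision. The log-determinant in the normalizing
# constant can be precomputed once, outside the sampling loop.
quad = x @ (Q @ x)
unnormalized_logp = -0.5 * quad
```

The per-iteration cost of NUTS is then (number of leapfrog steps) × (cost of one gradient/log-density evaluation), which is the quantity my question is really about.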