Two quick things. First, wall time is obviously important, but on its own it doesn’t mean much when comparing reasonably similar PPLs, models, or sampling algorithms. I am seeing considerably higher ESS (e.g., 2x higher overall) and lower \hat{r} from v4 than from v3. Second, with v3 I am seeing lower-than-target acceptance rates and a small number of divergences, both of which can make sampling look artificially “faster” at the expense of useful information (consistent with the lower ESS).
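To make the comparison concrete, here is a rough sketch of how I would put two runs on the same footing: divide ESS by wall time, and flag the diagnostics that can make a run look faster than it really is. Everything here is an assumption for illustration: the `RunSummary` class, the `ess_per_second` and `diagnostic_flags` helpers, the thresholds, and especially the numbers, which are placeholders and not my actual results.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical container for one sampling run's diagnostics."""
    name: str
    wall_seconds: float   # total sampling wall time
    min_ess: float        # smallest ESS across parameters
    max_rhat: float       # largest r-hat across parameters
    mean_accept: float    # observed mean acceptance rate
    target_accept: float  # acceptance rate the sampler was asked to hit
    n_divergences: int


def ess_per_second(run: RunSummary) -> float:
    """Effective samples per second of wall time."""
    return run.min_ess / run.wall_seconds


def diagnostic_flags(run: RunSummary, rhat_limit: float = 1.01,
                     accept_slack: float = 0.05) -> list:
    """Return reasons to distrust the run's raw speed (thresholds are assumed)."""
    flags = []
    if run.n_divergences > 0:
        flags.append(f"{run.n_divergences} divergences")
    if run.mean_accept < run.target_accept - accept_slack:
        flags.append("acceptance rate below target")
    if run.max_rhat > rhat_limit:
        flags.append("r-hat above limit")
    return flags


# Placeholder numbers, invented purely to show the calculation.
run_a = RunSummary("run_a", wall_seconds=60.0, min_ess=500.0, max_rhat=1.02,
                   mean_accept=0.70, target_accept=0.80, n_divergences=4)
run_b = RunSummary("run_b", wall_seconds=90.0, min_ess=1500.0, max_rhat=1.00,
                   mean_accept=0.81, target_accept=0.80, n_divergences=0)

for run in (run_a, run_b):
    print(run.name, round(ess_per_second(run), 2), diagnostic_flags(run))
```

With these made-up inputs, the run with the shorter wall time ends up with fewer effective samples per second and trips all three flags, which is the pattern I mean by “faster” sampling that costs useful information.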