No answers here, but I would also be interested in knowing how to do things along these lines. The sticking point seems to be getting access to the tuned sampler. You can get the initialized sampler from init_nuts and pass it to sample() (or iter_sample()). But once the sampler is running (e.g., during tuning), the only thing you can easily access seems to be the resulting trace, not the current state of the sampler or step methods.