@junpenglao my understanding is that the idea here was to mix two ways of sampling the free variables: some with NUTS (logp graph) and others via forward draws (RV graph).
The step method is needed only to separate the two types of draws, so that the different types of variables can be held constant across the two graphs: the forward-drawn RVs are held constant while doing the logp+NUTS draws, and vice versa. This assumes a bidirectional interdependence, although in the original example here the forward RV graph does not depend on any of the free variables being sampled with NUTS.
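
To make the alternation concrete, here is a rough sketch of what I mean. It is a toy in plain NumPy, not PyMC code: the model (a standard bivariate normal with correlation `rho`), the function names, and the random-walk Metropolis update standing in for NUTS are all my own assumptions for illustration. `x` plays the role of the forward-drawn RV (drawn exactly from its conditional given the current `y`), and `y` plays the role of the logp-sampled variable (updated with `x` held constant).

```python
import numpy as np

RHO = 0.8                      # assumed correlation of the toy target
COND_VAR = 1.0 - RHO ** 2      # variance of each conditional distribution


def forward_draw_x(y, rng):
    # "RV graph" side: exact forward draw of x | y, with y held constant.
    return rng.normal(RHO * y, np.sqrt(COND_VAR))


def logp_y_given_x(y, x):
    # "logp graph" side: unnormalised log-density of y | x.
    return -0.5 * (y - RHO * x) ** 2 / COND_VAR


def metropolis_update_y(y, x, rng, scale=1.0):
    # Stand-in for a NUTS step (plain random-walk Metropolis), x held constant.
    proposal = y + scale * rng.normal()
    log_accept = logp_y_given_x(proposal, x) - logp_y_given_x(y, x)
    if np.log(rng.uniform()) < log_accept:
        return proposal
    return y


def sample(n_draws, rng):
    # Alternate the two kinds of update, one block at a time.
    x, y = 0.0, 0.0
    draws = np.empty((n_draws, 2))
    for i in range(n_draws):
        x = forward_draw_x(y, rng)        # y is fixed during the forward draw
        y = metropolis_update_y(y, x, rng)  # x is fixed during the logp step
        draws[i] = (x, y)
    return draws


draws = sample(5000, np.random.default_rng(0))
```

In the one-directional case from the original example, `forward_draw_x` would simply ignore `y`, and the step method would only be there to keep `x` fixed while `y` is updated.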