I think it’s also important to realize this equivalence:
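import numpy as np

n_years = 5  # assumed value for illustration; any positive integer works

# Centered form: draw the year effects directly around the location effect
rng = np.random.default_rng(1234)
beta_location_1 = rng.normal()
beta_location_year_1 = rng.normal(loc=beta_location_1, size=n_years)

# Additive form: same seed, so the underlying standard-normal draws are identical
rng = np.random.default_rng(1234)
beta_location_2 = rng.normal()
beta_year = rng.normal(size=n_years)
beta_location_year_2 = beta_location_2 + beta_year

np.allclose(beta_location_year_1, beta_location_year_2)  # True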
A model with additive parameters is a hierarchy, just written in a different way. Compare how I wrote beta_location_year_2 with a non-centered parameterization: the year values end up clustered around the mean set by the location, exactly as you describe (just imagine a sigma_year multiplying beta_year).
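
To make the sigma_year part concrete, here is a minimal sketch of the same comparison with an explicit scale. The values of n_years and sigma_year are made up for illustration, and the names centered, non_centered, and beta_year_raw are mine rather than anything from your model.

```python
import numpy as np

n_years = 5        # assumed, for illustration only
sigma_year = 0.5   # assumed spread of the year effects around the location

# Centered form: year effects drawn around the location effect
rng = np.random.default_rng(1234)
beta_location = rng.normal()
centered = rng.normal(loc=beta_location, scale=sigma_year, size=n_years)

# Non-centered (additive) form: standard-normal offsets, scaled and shifted
rng = np.random.default_rng(1234)
beta_location_nc = rng.normal()
beta_year_raw = rng.normal(size=n_years)
non_centered = beta_location_nc + sigma_year * beta_year_raw

# Same seed, same underlying draws, so the two forms should agree
print(np.allclose(centered, non_centered))
```

Both forms describe the same distribution for the location-year effects. The difference only matters to the sampler, which often handles one form better than the other depending on how much data each group has.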