If it’s like Stan’s implementation, it just follows the paper:
The struggle is finding a good step size for the stochastic gradient descent (the stochasticity arises because the KL divergence is evaluated only up to a constant via the evidence lower bound (ELBO), which is itself estimated with Monte Carlo draws from the approximating normal distribution). A series of experiments has shown that ADVI is much more stable with a large number of Monte Carlo draws per ELBO evaluation, with a stick-the-landing reparameterization gradient (which helps especially when the number of draws per ELBO evaluation isn't massive, i.e., up in the 100K range), and with much more careful step-size selection than in the original algorithm.
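To make the stochasticity concrete, here's a minimal JAX sketch (not Stan's actual code) of a reparameterized Monte Carlo ELBO estimate for a mean-field normal approximation, with an optional stick-the-landing variant that stops gradients through the variational parameters inside the `log q` term. All the names (`log_joint`, `num_draws`, `elbo_estimate`, etc.) are mine, not from any library.

```python
import jax
import jax.numpy as jnp


def log_q(z, mu, log_sigma):
    """Diagonal-normal log density, summed over dimensions."""
    sigma = jnp.exp(log_sigma)
    return jnp.sum(-0.5 * ((z - mu) / sigma) ** 2 - log_sigma
                   - 0.5 * jnp.log(2.0 * jnp.pi))


def elbo_estimate(params, key, log_joint, num_draws, stick_the_landing):
    """Monte Carlo ELBO via the reparameterization trick."""
    mu, log_sigma = params
    eps = jax.random.normal(key, (num_draws,) + mu.shape)
    z = mu + jnp.exp(log_sigma) * eps  # reparameterized draws from q
    if stick_the_landing:
        # Stop gradients through the variational parameters inside
        # log q (but not through z): this drops a zero-mean score
        # term and typically reduces gradient variance.
        mu_q = jax.lax.stop_gradient(mu)
        ls_q = jax.lax.stop_gradient(log_sigma)
    else:
        mu_q, ls_q = mu, log_sigma
    lp = jax.vmap(log_joint)(z)                         # log p(z), up to a constant
    lq = jax.vmap(lambda zi: log_q(zi, mu_q, ls_q))(z)  # log q(z)
    return jnp.mean(lp - lq)


elbo_grad = jax.grad(elbo_estimate)  # gradient w.r.t. (mu, log_sigma)

# Example: fit q to a standard-normal target with plain SGD ascent.
# The fixed step_size here is exactly the part that needs care.
log_joint = lambda z: jnp.sum(-0.5 * z ** 2)  # log density up to a constant
key = jax.random.PRNGKey(0)
params = (jnp.zeros(2), jnp.zeros(2))         # (mu, log_sigma)
step_size = 0.05
for _ in range(200):
    key, sub = jax.random.split(key)
    g = elbo_grad(params, sub, log_joint, 16, True)
    params = jax.tree_util.tree_map(lambda p, gp: p + step_size * gp,
                                    params, g)
```

One caveat: Stan's ADVI, following the paper, uses the closed-form entropy of the normal rather than a Monte Carlo `log q` term; the Monte Carlo form above is the one where the stick-the-landing trick applies.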