While I was searching for something completely unrelated, I came across this old thread about beta parametrization, and it links through to “implicit normalization of gradients”: