torch has a gumbel_softmax too:
https://pytorch.org/docs/stable/_modules/torch/nn/functional.html
Their version takes an eps argument. Maybe they clamp the y_i to avoid division by zero?
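One plausible reading of that eps, based on how classic Gumbel-softmax reference implementations use it: it guards the two nested logs when sampling the Gumbel noise (log(0) is the failure mode), rather than clamping the output y_i. A pure-Python sketch of that reading, not PyTorch's actual code:

```python
import math
import random

def gumbel_noise(eps: float = 1e-10) -> float:
    """One Gumbel(0, 1) sample via inverse transform sampling.

    u == 0.0 would make the inner log -inf, and an inner log of 0
    would break the outer log; eps guards both.
    """
    u = random.random()  # u in [0, 1)
    return -math.log(-math.log(u + eps) + eps)

def gumbel_softmax(logits, tau: float = 1.0, eps: float = 1e-10):
    """Soft sample from a categorical: softmax((logits + noise) / tau)."""
    perturbed = [(l + gumbel_noise(eps)) / tau for l in logits]
    m = max(perturbed)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in perturbed]
    z = sum(exps)
    return [e / z for e in exps]
```

With this reading, no division by zero can occur in the softmax itself, since exp(0) = 1 guarantees a nonzero denominator after max-subtraction; the eps only protects the noise sampling.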