Gumbel-Softmax version of Bernoulli and Categorical distributions

I’m not familiar with these parametrizations – do you have a reference?

In general, a different parametrization can change how well a particular model fits, but exactly how it changes depends on the details of the model. The symbolic-pymc project is an interesting effort to do things like this automatically, and Hoffman, Johnson, and Tran had a paper on this last year.
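
For anyone following along, here is a minimal NumPy sketch of what I assume the title means: the relaxation usually called Gumbel-Softmax, where Gumbel noise is added to the log-probabilities and a tempered softmax replaces the hard argmax. This is my reading and has not been confirmed in this thread. The function name `gumbel_softmax_sample` and the `temperature` argument are hypothetical, not part of PyMC or symbolic-pymc. The Bernoulli case would just be the two-category version.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed (continuous) sample on the probability simplex.

    Hypothetical helper for illustration only; not a PyMC API.
    """
    logits = np.asarray(logits, dtype=float)
    # Gumbel(0, 1) noise via the inverse CDF: g = -log(-log(u)), u ~ Uniform(0, 1).
    # The lower bound keeps log(u) finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    # Tempered softmax; subtracting the max is only for numerical stability.
    z = (logits + g) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    w = np.exp(z)
    return w / w.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
probs = np.array([0.2, 0.5, 0.3])  # made-up category probabilities
samples = np.stack(
    [gumbel_softmax_sample(np.log(probs), 0.1, rng) for _ in range(5000)]
)
# Taking the argmax of each relaxed sample is the Gumbel-max trick,
# so these frequencies should be close to `probs`.
hard_freq = np.bincount(samples.argmax(axis=-1), minlength=3) / len(samples)
print(hard_freq)
```

As the temperature goes to zero the samples approach one-hot vectors, and at higher temperatures they spread toward the interior of the simplex. How that trade-off affects fitting is exactly the kind of detail that depends on the model.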