Gumbel-Softmax version of Bernoulli and Categorical distributions

Does PyMC have a Gumbel-Softmax reparametrization of the Bernoulli and Categorical distributions?
Does that reparametrization actually improve fitting of Bernoulli and Categorical variables?
Is it possible to use a random variable for the temperature?

Thanks again.


I’m not familiar with these parametrizations – do you have a reference?

In general, a different parametrization may change the fitting of a particular model, but the details of the model determine exactly how it changes. symbolic-pymc is an interesting project for doing things like this automatically, and Hoffman, Johnson, and Tran had a paper on this last year.

The 2016 paper, "Categorical Reparameterization with Gumbel-Softmax" (Jang, Gu, and Poole): https://arxiv.org/abs/1611.01144

tfp has an ExpRelaxedOneHotCategorical, but the temperature must be a float. The Exp part of the name refers to another reparametrization, in log-space, that avoids underflow in the logp.
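
For reference, a minimal sketch of drawing relaxed samples with it (assuming a recent tensorflow_probability; the specific logits are just an illustration):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# ExpRelaxedOneHotCategorical works in log-space: samples are log(y)
# for y on the simplex, which is the underflow-avoiding trick above.
dist = tfd.ExpRelaxedOneHotCategorical(temperature=0.5,
                                       logits=[0.1, 1.0, -1.0])

log_y = dist.sample(5)       # shape (5, 3); each row is the log of a simplex point
y = tf.exp(log_y)            # back on the probability simplex
print(dist.log_prob(log_y))  # the density is defined for the log-space variable
```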

As the temperature \tau approaches 0, the Gumbel-Softmax distribution converges to the categorical distribution.
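
A quick numpy sketch of that limit, sampling via the softmax of logits plus Gumbel noise (the class probabilities here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau, rng):
    # Sample Gumbel(0, 1) noise and apply the softmax relaxation.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    z = (logits + g) / tau
    z = z - z.max()            # stabilize the softmax
    e = np.exp(z)
    return e / e.sum()

logits = np.log(np.array([0.2, 0.5, 0.3]))
for tau in (5.0, 1.0, 0.1):
    print(tau, gumbel_softmax(logits, tau, rng).round(3))
# As tau shrinks, samples concentrate near a one-hot vertex of the simplex.
```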


I don’t get why their probability density is well-defined: some y_{i} can be 0, and then I get division by zero in the \pi_{i} / y_{i}^{\tau+1} terms.


For a categorical variable Y with k = 3 classes, the (3 - 1)-dimensional simplex is an ordinary triangle.

Let \vec{y} \in \Delta^{k-1} be the one-hot encoding of the categorical variable with k classes, where \Delta^{k-1} is the (k - 1)-dimensional probability simplex.
Let \pi_{i} = P(Y = i) be the probability that the categorical random variable takes the value i.

P_{\pi, \tau}(\vec{y}) = \Gamma(k) \tau^{k-1} \left( \sum_{i=1}^{k} \pi_{i} / y_{i}^{\tau} \right)^{-k} \prod_{i=1}^{k} \left( \pi_{i} / y_{i}^{\tau+1} \right)

I guess \Gamma(k) is just (k - 1)! when k is an integer.
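
As a sanity check, here is a direct transcription of that density into log-space (gumbel_softmax_logpdf is a hypothetical helper; it is only valid on the interior of the simplex, per the division-by-zero question above):

```python
import numpy as np
from scipy.special import gammaln

def gumbel_softmax_logpdf(y, pi, tau):
    # Log of: Gamma(k) * tau^(k-1) * (sum pi_i / y_i^tau)^(-k)
    #         * prod(pi_i / y_i^(tau+1)),
    # computed in log-space to avoid underflow.
    k = len(pi)
    log_ratio = np.log(pi) - tau * np.log(y)          # log(pi_i / y_i^tau)
    return (gammaln(k) + (k - 1) * np.log(tau)
            - k * np.log(np.sum(np.exp(log_ratio)))   # the (sum ...)^{-k} factor
            + np.sum(log_ratio) - np.sum(np.log(y)))  # prod pi_i / y_i^{tau+1}

# Example: density of an interior point for made-up probabilities.
print(gumbel_softmax_logpdf(np.array([0.2, 0.3, 0.5]),
                            np.array([0.2, 0.5, 0.3]), tau=1.0))
```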


torch has a gumbel_softmax too

https://pytorch.org/docs/stable/_modules/torch/nn/functional.html

They have an epsilon. Maybe they clamp the y_i to avoid division by zero?
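
A minimal usage sketch (the signature is from the current docs, which list eps as deprecated; the toy objective is made up):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[0.1, 1.0, -1.0]], requires_grad=True)

soft = F.gumbel_softmax(logits, tau=0.5)             # relaxed sample on the simplex
hard = F.gumbel_softmax(logits, tau=0.5, hard=True)  # one-hot forward pass,
                                                     # soft straight-through gradient

loss = (soft * torch.tensor([0.0, 1.0, 2.0])).sum()  # toy objective
loss.backward()
print(soft, hard, logits.grad, sep="\n")             # grad is nonzero: differentiable
```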

Why does the paper say a categorical distribution is non-differentiable?
I was thinking P_{\vec{\pi}}(\vec{y}) = \sum_{i = 1}^{k} y_{i} \pi_{i}, which is differentiable with respect to \vec{\pi}.
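
If it helps, I believe the non-differentiability is about the sample, not the pmf: an exact categorical draw via the Gumbel-max trick takes an argmax, which has zero gradient almost everywhere, while the softmax relaxation is smooth. A sketch (logits made up):

```python
import torch

logits = torch.tensor([0.1, 1.0, -1.0], requires_grad=True)
g = -torch.log(-torch.log(torch.rand(3)))  # Gumbel(0, 1) noise

# Exact categorical sample (Gumbel-max trick): argmax is piecewise
# constant in the logits, so d(sample)/d(logits) = 0 almost everywhere.
idx = torch.argmax(logits + g)

# Gumbel-Softmax relaxation: smooth in the logits, so gradients flow.
y = torch.softmax((logits + g) / 0.5, dim=-1)
y[idx].backward()
print(logits.grad)
```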