When I look at the source code of pymc3, I see that the `logp`

of `Dirichlet`

would be `-np.inf`

if the random variable is out of bounds of the probability simplex. But `-np.inf`

is not differentiable. How does Dirichlet deal with the out of bound situation?

How would the integrator and mass matrix adaptation work with a logp that is `-np.inf`

?

The Dirichlet also use a “stick breaking” transformation. Does that mean reparametrizing the Dirichlet as independent Beta distributions that are bounded in the interval (0, 1)?

While the `bound`

condition in the source code seems to constraint the `value`

to be in the interval (0, 1), the comment states the `bound`

constrains the sum of values to be 1. Would an additional re-parametrization to logits transform the (0, 1) constraint to a (-\infty, \infty) constraint?

Thanks again.

Cut & paste from pymc3 source code:

```
class Continuous(Distribution):
def __init__(self, shape=(), dtype=None, defaults=('median', 'mean', 'mode'),
*args, **kwargs):
if dtype is None:
dtype = theano.config.floatX
super().__init__(shape, dtype, defaults=defaults, *args, **kwargs)
....
class Dirichlet(Continuous):
....
def __init__(self, a, transform=transforms.stick_breaking,
*args, **kwargs):
shape = np.atleast_1d(a.shape)[-1]
kwargs.setdefault("shape", shape)
super().__init__(transform=transform, *args, **kwargs)
...
def logp(self, value):
k = self.k
a = self.a
# only defined for sum(value) == 1
return bound(tt.sum(logpow(value, a - 1) - gammaln(a), axis=-1)
+ gammaln(tt.sum(a, axis=-1)),
tt.all(value >= 0), tt.all(value <= 1),
k > 1, tt.all(a > 0), broadcast_conditions=False)
def bound(logp, *conditions, **kwargs):
broadcast_conditions = kwargs.get('broadcast_conditions', True)
if broadcast_conditions:
alltrue = alltrue_elemwise
else:
alltrue = alltrue_scalar
return tt.switch(alltrue(conditions), logp, -np.inf)
```