# Multinomial returns nan

I played with the implemented multinomial distribution found this unexpected behavior of

import numpy as np
import pymc3 as pm
np.exp(pm.Multinomial.dist(n = 1, p = np.array([0, 1])).logp(np.array([0, 1])).eval())


My PyMC3 3.4.1 returns array([nan]) even know by definition this should be

\frac{n!}{\prod_{i=1}^k x_i!} \prod_{i=1}^k p_i^{x_i} = 0^0 \cdot 1^1 = 1.

However this

np.exp(pm.Multinomial.dist(n = 2, p = np.array([0, 1])).logp(np.array([1, 1])).eval())


correctly returns array([0.]).

I think one can work around this issue by sub setting p and x to the values where p \ne 0 or x \ne 0 but it would be nicer if this is done automatically.

1 Like

Yes the edge of the domain is not very well tested. @gBokiau did some great work to extend it but currently is still a lot of work to be done.

Also unexpectedly

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([1, 0])).eval())


returns 0.0. But I assumed it to be

\frac{1}{B(a)} \prod_{i=1}^k x_i^{a_i-1} = \frac{1}{0.5} 1^1 \cdot 0^0 = 2.

The definition given here https://docs.pymc.io/api/distributions/multivariate.html#pymc3.distributions.multivariate.Dirichlet might has a typo since the last subscript k probably belongs to \theta like that:

f(x|\theta) = \frac{\Gamma( \sum_{i=1}^k \theta_i)} {\prod_{i=1}^k \Gamma\left( \theta_i\right)} \prod_{i=1}^{k-1} x_i^{\theta_i-1} \left(1 - \sum_{i=1}^{k-1} x_i \right)^{\theta_k}

This would make sense because it states

Only the first k-1 elements of x are expected.

and since it should hold that \sum_{i=1}^k x_i = 1 the term in the brackets should equal x_k. But when tested

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([0.5, 0.5])).eval())


returns 1.0 which is correct but

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([.5, 0])).eval())


returns 0.

So I assume whenever there is a 0 in x it will return 0 even if the corespoding value in a is 1.

There’s a good chance this is solved in the master, as I don’t think my commit is part of a release yet. Would you mind trying it out with the dev version @Dominik? And report an issue if it’s failing with that as well (again, don’t think it will, but who knows).

1 Like

I installed git+https://github.com/pymc-devs/pymc3 but the results seem to be the same. Does this code work in your version:

import numpy as np
import pymc3 as pm
np.exp(pm.Multinomial.dist(n = 1, p = np.array([0, 1])).logp(np.array([0, 1])).eval())


Should I use a different branch?

Thanks for trying… no that means there’s more to it. I don’t have my setup in front of me, but it probably wouldn’t work either. Will open an issue.

Ah you beat me to it Thanks!

You are welcome! Thanks for the support FYI, scipy also returns nan for:

st.multinomial.pmf(np.array([0, 1]), 1, np.array([0, 1]))


And for dirichlet scipy return error for array containing 0:

ValueError: Each entry in 'x' must be greater than zero.


This is problematic in multinomial/softmax regression scenario’s though, for which you definitely would need p to support 0.
I can’t immediately think of a scenario with Dirichlet… but by analogy I would presume the same holds?

Agree about multinomial. But Dirichlet’s support is (0, 1).

I would argue that a support of p=0, x=0 in the multinomial case implies a support of p=0, a=1 in the Dirichlet case. If f(x|p) is the multinomial distribution and f(p) is uniform on the simplex then by Bayes theorem

f(p|x) = \frac{f(x|p)f(p)}{f(x)} = f_{dirichlet}(p|a)

for a = x+1. Hence there should be support for a=1.

In particular for my application I need to draw from different Dirichlet distributions, all with the same shape, but sometimes a given category i has no members a_i=1. However one could achieve the same result by removing this category from the vector a and append p_i=0 to the result afterwards.

Scipy also does some other funny stuff. E.g. scipy.stats.lognorm.pdf(0, 1) returns 0.0 even know the definition includes a term 1/x and hence should not support x=0. Nevertheless this continuous continuation can be useful Edit: Corrected x_i=0 to p_i=0.

This is a good point. Would you like to send a pull request on Github to fix this?

I can try, but i am not very familiar with theano yet 1 Like
1 Like