# Multinomial returns nan

I played with the implemented multinomial distribution found this unexpected behavior of

import numpy as np
import pymc3 as pm
np.exp(pm.Multinomial.dist(n = 1, p = np.array([0, 1])).logp(np.array([0, 1])).eval())


My PyMC3 3.4.1 returns array([nan]) even know by definition this should be

\frac{n!}{\prod_{i=1}^k x_i!} \prod_{i=1}^k p_i^{x_i} = 0^0 \cdot 1^1 = 1.

However this

np.exp(pm.Multinomial.dist(n = 2, p = np.array([0, 1])).logp(np.array([1, 1])).eval())


correctly returns array([0.]).

I think one can work around this issue by sub setting p and x to the values where p \ne 0 or x \ne 0 but it would be nicer if this is done automatically.

1 Like

Yes the edge of the domain is not very well tested. @gBokiau did some great work to extend it but currently is still a lot of work to be done.

Also unexpectedly

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([1, 0])).eval())


returns 0.0. But I assumed it to be

\frac{1}{B(a)} \prod_{i=1}^k x_i^{a_i-1} = \frac{1}{0.5} 1^1 \cdot 0^0 = 2.

The definition given here https://docs.pymc.io/api/distributions/multivariate.html#pymc3.distributions.multivariate.Dirichlet might has a typo since the last subscript k probably belongs to \theta like that:

f(x|\theta) = \frac{\Gamma( \sum_{i=1}^k \theta_i)} {\prod_{i=1}^k \Gamma\left( \theta_i\right)} \prod_{i=1}^{k-1} x_i^{\theta_i-1} \left(1 - \sum_{i=1}^{k-1} x_i \right)^{\theta_k}

This would make sense because it states

Only the first k-1 elements of x are expected.

and since it should hold that \sum_{i=1}^k x_i = 1 the term in the brackets should equal x_k. But when tested

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([0.5, 0.5])).eval())


returns 1.0 which is correct but

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([.5, 0])).eval())


returns 0.

So I assume whenever there is a 0 in x it will return 0 even if the corespoding value in a is 1.

Thereâ€™s a good chance this is solved in the master, as I donâ€™t think my commit is part of a release yet. Would you mind trying it out with the dev version @Dominik? And report an issue if itâ€™s failing with that as well (again, donâ€™t think it will, but who knows).

1 Like

I installed git+https://github.com/pymc-devs/pymc3 but the results seem to be the same. Does this code work in your version:

import numpy as np
import pymc3 as pm
np.exp(pm.Multinomial.dist(n = 1, p = np.array([0, 1])).logp(np.array([0, 1])).eval())


Should I use a different branch?

Thanks for tryingâ€¦ no that means thereâ€™s more to it. I donâ€™t have my setup in front of me, but it probably wouldnâ€™t work either. Will open an issue.

Ah you beat me to it Thanks!

You are welcome! Thanks for the support

FYI, scipy also returns nan for:

st.multinomial.pmf(np.array([0, 1]), 1, np.array([0, 1]))


And for dirichlet scipy return error for array containing 0:

ValueError: Each entry in 'x' must be greater than zero.


This is problematic in multinomial/softmax regression scenarioâ€™s though, for which you definitely would need p to support 0.
I canâ€™t immediately think of a scenario with Dirichletâ€¦ but by analogy I would presume the same holds?

Agree about multinomial. But Dirichletâ€™s support is (0, 1).

I would argue that a support of p=0, x=0 in the multinomial case implies a support of p=0, a=1 in the Dirichlet case. If f(x|p) is the multinomial distribution and f(p) is uniform on the simplex then by Bayes theorem

f(p|x) = \frac{f(x|p)f(p)}{f(x)} = f_{dirichlet}(p|a)

for a = x+1. Hence there should be support for a=1.

In particular for my application I need to draw from different Dirichlet distributions, all with the same shape, but sometimes a given category i has no members a_i=1. However one could achieve the same result by removing this category from the vector a and append p_i=0 to the result afterwards.

Scipy also does some other funny stuff. E.g. scipy.stats.lognorm.pdf(0, 1) returns 0.0 even know the definition includes a term 1/x and hence should not support x=0. Nevertheless this continuous continuation can be useful

Edit: Corrected x_i=0 to p_i=0.

This is a good point. Would you like to send a pull request on Github to fix this?

I can try, but i am not very familiar with theano yet

1 Like
1 Like