Multinomial returns nan

Dominik · June 22, 2018, 6:29pm

I played with the implemented multinomial distribution found this unexpected behavior of

import numpy as np
import pymc3 as pm
np.exp(pm.Multinomial.dist(n = 1, p = np.array([0, 1])).logp(np.array([0, 1])).eval())

My PyMC3 3.4.1 returns array([nan]) even know by definition this should be

\frac{n!}{\prod_{i=1}^k x_i!} \prod_{i=1}^k p_i^{x_i} = 0^0 \cdot 1^1 = 1.

However this

np.exp(pm.Multinomial.dist(n = 2, p = np.array([0, 1])).logp(np.array([1, 1])).eval())

correctly returns array([0.]).

I think one can work around this issue by sub setting p and x to the values where p \ne 0 or x \ne 0 but it would be nicer if this is done automatically.

junpenglao · June 22, 2018, 7:39pm

Yes the edge of the domain is not very well tested. @gBokiau did some great work to extend it but currently is still a lot of work to be done.

Dominik · June 22, 2018, 7:46pm

Also unexpectedly

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([1, 0])).eval())

returns 0.0. But I assumed it to be

\frac{1}{B(a)} \prod_{i=1}^k x_i^{a_i-1} = \frac{1}{0.5} 1^1 \cdot 0^0 = 2.

The definition given here Multivariate — PyMC 5.10.0 documentation might has a typo since the last subscript k probably belongs to \theta like that:

f(x|\theta) = \frac{\Gamma( \sum_{i=1}^k \theta_i)} {\prod_{i=1}^k \Gamma\left( \theta_i\right)} \prod_{i=1}^{k-1} x_i^{\theta_i-1} \left(1 - \sum_{i=1}^{k-1} x_i \right)^{\theta_k}

This would make sense because it states

Only the first k-1 elements of x are expected.

and since it should hold that \sum_{i=1}^k x_i = 1 the term in the brackets should equal x_k. But when tested

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([0.5, 0.5])).eval())

returns 1.0 which is correct but

np.exp(pm.Dirichlet.dist(a = np.array([2, 1])).logp(np.array([.5, 0])).eval())

returns 0.

So I assume whenever there is a 0 in x it will return 0 even if the corespoding value in a is 1.

gBokiau · June 22, 2018, 7:56pm

There’s a good chance this is solved in the master, as I don’t think my commit is part of a release yet. Would you mind trying it out with the dev version @Dominik? And report an issue if it’s failing with that as well (again, don’t think it will, but who knows).

Dominik · June 22, 2018, 8:20pm

I installed git+https://github.com/pymc-devs/pymc3 but the results seem to be the same. Does this code work in your version:

import numpy as np
import pymc3 as pm
np.exp(pm.Multinomial.dist(n = 1, p = np.array([0, 1])).logp(np.array([0, 1])).eval())

Should I use a different branch?

gBokiau · June 22, 2018, 8:36pm

Thanks for trying… no that means there’s more to it. I don’t have my setup in front of me, but it probably wouldn’t work either. Will open an issue.

gBokiau · June 22, 2018, 8:36pm

Ah you beat me to it Thanks!

Dominik · June 22, 2018, 8:38pm

You are welcome! Thanks for the support

junpenglao · June 22, 2018, 9:09pm

FYI, scipy also returns nan for:

st.multinomial.pmf(np.array([0, 1]), 1, np.array([0, 1]))

And for dirichlet scipy return error for array containing 0:

ValueError: Each entry in 'x' must be greater than zero.

gBokiau · June 22, 2018, 9:22pm

This is problematic in multinomial/softmax regression scenario’s though, for which you definitely would need p to support 0.
I can’t immediately think of a scenario with Dirichlet… but by analogy I would presume the same holds?

junpenglao · June 22, 2018, 9:32pm

Agree about multinomial. But Dirichlet’s support is (0, 1).

Dominik · June 26, 2018, 8:33am

I would argue that a support of p=0, x=0 in the multinomial case implies a support of p=0, a=1 in the Dirichlet case. If f(x|p) is the multinomial distribution and f(p) is uniform on the simplex then by Bayes theorem

f(p|x) = \frac{f(x|p)f(p)}{f(x)} = f_{dirichlet}(p|a)

for a = x+1. Hence there should be support for a=1.

In particular for my application I need to draw from different Dirichlet distributions, all with the same shape, but sometimes a given category i has no members a_i=1. However one could achieve the same result by removing this category from the vector a and append p_i=0 to the result afterwards.

Scipy also does some other funny stuff. E.g. scipy.stats.lognorm.pdf(0, 1) returns 0.0 even know the definition includes a term 1/x and hence should not support x=0. Nevertheless this continuous continuation can be useful

Edit: Corrected x_i=0 to p_i=0.

junpenglao · June 26, 2018, 10:34am

This is a good point. Would you like to send a pull request on Github to fix this?

Dominik · June 26, 2018, 10:41am

I can try, but i am not very familiar with theano yet

Dominik · June 26, 2018, 2:14pm

Done: https://github.com/pymc-devs/pymc3/pull/3059

Topic		Replies	Views
Problem with pymc3.Multinomial() distribution? Questions	3	570	July 14, 2020
Multinomial distribution doesn't behave as expected Questions	2	586	March 28, 2019
Implementing the DirichletMultinomial distribution in PyMC3 Development development	42	1805	January 18, 2021
PyMC v5.14.0 introduces error in MultivariateNormal with NaN v5 development , bug , modeling	3	130	May 14, 2024
FloatingPointError: NaN occurred in optimization in HMM Questions	5	1078	November 18, 2021

Multinomial returns nan

Related topics