Multinomial probabilities don't add to 1

Hello,
I am working on converting this notebook to PyMC v5 (from v4). I am running into an issue with code block 11.59. This is the code:

N = 500

# simulate family incomes for each individual
family_income = np.random.rand(N)

# assign a unique coefficient for each type of event
b = np.array([-2.0, 0.0, 2.0])

p = softmax(np.array([0.5, 1.0, 1.5])[:, None] + np.outer(b, family_income), axis=0).T

career = np.asarray([np.random.multinomial(1, pp) for pp in p])
career = np.where(career == 1)[1]

with pm.Model() as m11_14:
    a = pm.Normal("a", 0.0, 1.5, shape=2)  # intercepts
    b = pm.Normal("b", 0.0, 1.0, shape=2)  # coefficients on family income

    s0 = a[0] + b[0] * family_income
    s1 = a[1] + b[1] * family_income
    s2 = np.zeros(N)  # pivot
    s = pm.math.stack([s0, s1, s2]).T

    p_ = pt.special.softmax(s)
    career_obs = pm.Categorical("career", p=p_, observed=career)

    idata_11_14 = pm.sample(1000, tune=2000, target_accept=0.9, random_seed=RANDOM_SEED)

However, the model fails to sample.

Logp initial evaluation results:
{'a': -2.79, 'b': -1.9, 'career': -inf}
You can call `model.debug()` for more details.

When I call m11_14.debug() I can see that p_ doesn’t add up to 1

point={'a': array([0., 0.]), 'b': array([0., 0.])}

The variable career has the following parameters:
0: Softmax{axis=None} [id A] <Matrix(float64, shape=(500, 3))>
 └─ Transpose{axes=[1, 0]} [id B] <Matrix(float64, shape=(500, 3))>
    └─ Join [id C] <Matrix(float64, shape=(3, 500))>
       β”œβ”€ 0 [id D] <Scalar(int8, shape=())>
       β”œβ”€ Add [id E] <Matrix(float64, shape=(1, 500))>
       β”‚  β”œβ”€ ExpandDims{axes=[0, 1]} [id F] <Matrix(float64, shape=(1, 1))>
       β”‚  β”‚  └─ Subtensor{i} [id G] <Scalar(float64, shape=())>
       β”‚  β”‚     β”œβ”€ a [id H] <Vector(float64, shape=(2,))>
       β”‚  β”‚     └─ 0 [id I] <uint8>
       β”‚  └─ Mul [id J] <Matrix(float64, shape=(1, 500))>
       β”‚     β”œβ”€ ExpandDims{axes=[0, 1]} [id K] <Matrix(float64, shape=(1, 1))>
       β”‚     β”‚  └─ Subtensor{i} [id L] <Scalar(float64, shape=())>
       β”‚     β”‚     β”œβ”€ b [id M] <Vector(float64, shape=(2,))>
       β”‚     β”‚     └─ 0 [id I] <uint8>
       β”‚     └─ [[0.145023 ... 61251361]] [id N] <Matrix(float64, shape=(1, 500))>
       β”œβ”€ Add [id O] <Matrix(float64, shape=(1, 500))>
       β”‚  β”œβ”€ ExpandDims{axes=[0, 1]} [id P] <Matrix(float64, shape=(1, 1))>
       β”‚  β”‚  └─ Subtensor{i} [id Q] <Scalar(float64, shape=())>
       β”‚  β”‚     β”œβ”€ a [id H] <Vector(float64, shape=(2,))>
       β”‚  β”‚     └─ 1 [id R] <uint8>
       β”‚  └─ Mul [id S] <Matrix(float64, shape=(1, 500))>
       β”‚     β”œβ”€ ExpandDims{axes=[0, 1]} [id T] <Matrix(float64, shape=(1, 1))>
       β”‚     β”‚  └─ Subtensor{i} [id U] <Scalar(float64, shape=())>
       β”‚     β”‚     β”œβ”€ b [id M] <Vector(float64, shape=(2,))>
       β”‚     β”‚     └─ 1 [id R] <uint8>
       β”‚     └─ [[0.145023 ... 61251361]] [id N] <Matrix(float64, shape=(1, 500))>
       └─ [[0. 0. 0. ... 0. 0. 0.]] [id V] <Matrix(float64, shape=(1, 500))>
The parameters evaluate to:
0: [[0.00066667 0.00066667 0.00066667]
 [0.00066667 0.00066667 0.00066667]
 [0.00066667 0.00066667 0.00066667]
 ...
 [0.00066667 0.00066667 0.00066667]
 [0.00066667 0.00066667 0.00066667]
 [0.00066667 0.00066667 0.00066667]]
This does not respect one of the following constraints: 0 <= p <=1, sum(p) = 1
Last updated: Tue Jan 09 2024

Python implementation: CPython
Python version       : 3.11.7
IPython version      : 8.19.0

numpy     : 1.26.3
matplotlib: 3.8.2
scipy     : 1.11.4
pandas    : 2.1.4
pytensor  : 2.18.4
arviz     : 0.17.0
pymc      : 5.10.3

Watermark: 2.4.3

I suspect that I am treating the Pytensor variables incorrectly. Any guidance on this would be greatly appreciated!

By default Softmax works across all axis. You probably need axes=-1

1 Like

Thanks, Ricardo. This is exactly what I needed. Silly oversight on my part.