Implementing Softmax Regression


I’m trying to implement softmax regression using PyMC3. In this GitHub thread, @twiecki presented the following as the correct softmax implementation in PyMC3:

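import numpy as np
import pandas as pd
import pymc3 as pm
import seaborn as sns
import sklearn.preprocessing
import theano.tensor as tt

iris = sns.load_dataset("iris")
# Integer class codes 0, 1, 2 (.codes replaces the old .labels attribute)
y_2 = pd.Categorical(iris['species']).codes
x_n = iris.columns[:-1]
x_2 = iris[x_n].values

# One-hot encode the labels: shape (150, 3)
y_2_bin = sklearn.preprocessing.LabelBinarizer().fit_transform(y_2)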

with pm.Model() as modelo_s:
    alfa = pm.Normal('alfa', mu=0, sd=10, shape=3, testval=np.random.randn(3))
    beta = pm.Normal('beta', mu=0, sd=10, shape=(4,3), testval=np.random.randn(4, 3))
    mu = alfa + pm.math.dot(x_2, beta)
    p = pm.Deterministic('p', tt.nnet.softmax(mu))
    yl = pm.Categorical('yl', p=p, observed=np.bool8(y_2_bin))
    step = pm.Metropolis()
    trace_s = pm.sample(5000, step)

However, I get an error saying,

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (150,) (150,3)

when I try to execute the script. Can someone please help me to understand how to fix this error?


A Categorical RV takes a 1-D array of integer class indices as its observation, not a one-hot matrix:

yl = pm.Categorical('yl', p=p, observed=np.where(y_2_bin)[1])
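To see what np.where(y_2_bin)[1] does, here is a minimal NumPy sketch. The toy labels and the np.eye one-hot encoding are stand-ins I made up for illustration; they are meant to mimic what LabelBinarizer returns for three classes, not to reproduce the iris data.

```python
import numpy as np

# Hypothetical toy labels for 3 classes (a stand-in for y_2).
labels = np.array([0, 2, 1, 1, 0])

# One-hot encode them, mimicking LabelBinarizer output for 3 classes.
one_hot = np.eye(3, dtype=int)[labels]

# np.where returns (row_indices, column_indices) of the nonzero entries.
# Each row has exactly one nonzero entry, so the column indices are the
# class labels, in row order.
recovered = np.where(one_hot)[1]

print(recovered)
```

Since y_2 in the question already holds the integer class codes, passing observed=y_2 directly should be equivalent here.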


So does it figure out the mapping between the 2-D array p and the 1-D labels np.where(y_2_bin)[1] internally?


Yep, it uses the data to index into p: for each row i of p, it picks the probability in the column given by the i-th label.
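A rough NumPy sketch of that indexing, assuming the log-likelihood does something like p[arange(n), y] under the hood (the exact internals depend on the PyMC3 version). The probabilities and labels below are made up for illustration. It also shows why a one-hot matrix triggers the shape-mismatch error from the question.

```python
import numpy as np

# Hypothetical probabilities for 4 observations and 3 classes (rows sum to 1).
p = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],
])
y = np.array([0, 2, 1, 0])  # 1-D class indices, one per row of p

rows = np.arange(p.shape[0])
picked = p[rows, y]              # probability of the observed class per row
log_lik = np.log(picked).sum()   # categorical log-likelihood of the data

# Indexing with a (4, 3) one-hot array instead cannot be broadcast against
# the (4,) row index, which is the same kind of error as in the question.
one_hot = np.eye(3, dtype=int)[y]
mismatch_error = None
try:
    p[rows, one_hot]
except IndexError as err:
    mismatch_error = err

print(picked)
print(mismatch_error)
```

With 1-D labels, each observation selects exactly one entry of its row in p, so no explicit mapping has to be supplied.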