So you need p
and mu
has a shape of (N, 3)
, since your predictors df[['x1', 'x2', 'x3', 'x4']].values
has a shape of (N, 4)
, you need a beta with a shape of (4, 3)
to get there.
What I would do in this case:
with pm.Model() as model:
mu_common_prior = pm.Normal('common_mu', mu=0., sd=100**2)
sd_common_prior = pm.HalfCauchy('common_sd', 5)
beta = pm.Normal('beta', mu=mu_prior, sd=sd_prior, shape=(4, 3))
mu = pm.math.dot(df[['x1', 'x2', 'x3', 'x4']].values, beta)
p = T.nnet.softmax(mu)
y = pm.Categorical('y', p=p, observed=df['y'].values)