I’m trying to implement softmax regression with a Gaussian latent representation. The issue I’m facing is purely technical, rather than theoretical, but here is a brief mathematical introduction (you might as well skip it and go straight to the the MCVE implementation).
Consider a D-part probability vector p. Since vector p is constrained to an Aitchison simplex \mathcal{S}^{D}, we can map p onto an orthonormal basis in \mathbb{R}^{D-1}.
Where \Omega is a contrast matrix of D-1 rows and D columns. This transformation is an isometry on \mathcal{S}^{D}, therefore we can parametrise p using a latent Gaussian model, e.g.
where \alpha, \beta and \sigma are (D-1)-dimensional vectors and x is a predictor variable.
Here is an MCVE implementation
import numpy as np
import theano
import pymc3 as pm
def closure(x):
return x / x.sum(1)[:, None]
samples = np.array(
[[0.18, 0.2 , 0.21, 0.17, 0.23],
[0.2 , 0.2 , 0.2 , 0.17, 0.23],
[0.2 , 0.2 , 0.2 , 0.18, 0.21],
[0.18, 0.19, 0.2 , 0.16, 0.27],
[0.18, 0.21, 0.21, 0.19, 0.21],
[0.19, 0.19, 0.19, 0.17, 0.26],
[0.2 , 0.2 , 0.19, 0.18, 0.23],
[0.2 , 0.2 , 0.2 , 0.17, 0.23],
[0.17, 0.18, 0.17, 0.18, 0.29],
[0.18, 0.18, 0.17, 0.17, 0.29]]
)
omega = np.array(
[[ 0.55, 0.55, -0.37, -0.37, -0.37],
[ 0. , 0. , -0.41, -0.41, 0.82],
[ 0. , 0. , -0.71, 0.71, 0. ],
[-0.71, 0.71, 0. , 0. , 0. ]]
)
x = theano.shared(np.arange(10))
def closure(x):
return x / x.sum(1)[:, None]
with pm.Model() as model:
alpha = pm.Normal('alpha', mu=0, sd=1, shape=omega.shape[0])
beta = pm.Normal('beta', mu=0, sd=1, shape=omega.shape[0])
sigma = pm.HalfNormal('sigma', sd=1, shape=omega.shape[0])
mu = x * beta + alpha
latent = pm.Normal('latent', mu=mu, sd=sigma)
probs = closure(pm.math.exp(pm.math.dot(latent, omega)))
y = pm.Categorical('y', p=probs, observed=samples)
trace = pm.sample()
Unfortunately, I get raises the following error.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-0853e19afaf7> in <module>
24 sigma = pm.HalfNormal('sigma', sd=1, shape=omega.shape[0])
25 mu = x[:, None] * beta + alpha
---> 26 latent = pm.Normal('latent', mu=mu, sd=sigma)
27 probs = pm.math.exp(latent) / pm.math.exp(latent).sum(1)
28 y = pm.Categorical('y', p=probs, observed=samples)
...
TypeError: For compute_test_value, one input test value does not have the requested type.
The error when converting the test value to that variable type:
Wrong number of dimensions: expected 0, got 2 with shape (10, 4).
As far as I can tell, you can’t pass 2D arrays to parametrise stochastic variables, unless they are observable. I can easily solve the problem by moving the softmax part out of the model
with pm.Model() as model:
alpha = pm.Normal('alpha', mu=0, sd=1, shape=omega.shape[0])
beta = pm.Normal('beta', mu=0, sd=1, shape=omega.shape[0])
sigma = pm.HalfNormal('sigma', sd=1, shape=omega.shape[0])
mu = x[:, None] * beta + alpha
latent = pm.Normal('latent', mu=mu, sd=sigma, observed=(np.log(samples) @ omega.T))
trace = pm.sample()
posterior_samples = pm.sample_posterior_predictive(trace, samples=100, model=model)
latent_hat = np.mean(posterior_samples['latent'], axis=0)
y_hat = np.exp(latent_hat @ omega) / np.exp(latent_hat @ omega).sum(1)[:, None]
, but I’d like to have it in the model itself.