- I suppose specifying a shape in `latent` means that any future input must follow this shape. Is this the case?
- I had clearly misunderstood the distinction between `pm.Multinomial` and `pm.Categorical`, because I thought the latter was simply a normalised variant of the former. I’ve updated the maths in my question to make this clear. Since my original data are count vectors, I’ve reformulated the problem as a multinomial regression by adding a shared variable with the trial sizes.
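To make the distinction concrete for myself (toy numbers, nothing to do with my data): `pm.Categorical` models individual draws as outcome indices, whereas `pm.Multinomial` models whole count vectors with an explicit number of trials.

```python
import numpy as np
import pymc3 as pm

with pm.Model():
    p = pm.Dirichlet('p', a=np.ones(5))
    # Each Categorical observation is a single outcome index in {0, ..., 4}.
    pm.Categorical('draws', p=p, observed=[0, 3, 3, 1])
    # Each Multinomial observation is a count vector over n trials.
    pm.Multinomial('counts', n=10, p=p, observed=[[2, 1, 3, 0, 4]])
```

The reformulated model: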
```python
import numpy as np
import theano
import pymc3 as pm


def closure(x):
    # Normalise each row to sum to one.
    return x / x.sum(1)[:, None]


samples = np.array(
    [[287, 319, 335, 271, 367],
     [306, 306, 306, 260, 352],
     [295, 295, 295, 265, 309],
     [285, 301, 317, 253, 428],
     [184, 214, 214, 194, 214],
     [289, 289, 289, 258, 395],
     [320, 320, 304, 288, 368],
     [238, 238, 238, 203, 274],
     [341, 361, 341, 361, 581],
     [278, 278, 262, 262, 447]]
)

# Orthonormal basis of the subspace the latent Gaussians live in.
omega = np.array(
    [[ 0.55,  0.55, -0.37, -0.37, -0.37],
     [ 0.  ,  0.  , -0.41, -0.41,  0.82],
     [ 0.  ,  0.  , -0.71,  0.71,  0.  ],
     [-0.71,  0.71,  0.  ,  0.  ,  0.  ]]
)

c = theano.shared(samples.sum(1))  # trial sizes, one per count vector
x = theano.shared(np.arange(10))   # predictor

with pm.Model() as model:
    alpha = pm.Normal('alpha', mu=0, sd=1, shape=omega.shape[0])
    beta = pm.Normal('beta', mu=0, sd=1, shape=omega.shape[0])
    sigma = pm.HalfNormal('sigma', sd=1, shape=omega.shape[0])
    mu = x[:, None] * beta + alpha  # broadcast to (10, 4)
    latent = pm.Normal('latent', mu=mu, sd=sigma, shape=(10, 4))
    probs = closure(pm.math.dot(pm.math.exp(latent), omega))
    y = pm.Multinomial('y', n=c, p=probs, observed=samples)
    trace = pm.sample()
```
This model compiles, but I get the `ParallelSamplingError: Bad initial energy` error you’ve mentioned. My case is different from the simple normalisation you’ve linked to: I use an isometry to map probability vectors onto an orthonormal basis (and model them as latent Gaussians) and back onto the simplex, but I guess any type of closure triggers this issue.
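For concreteness, here is a rough numpy sketch of the round trip I have in mind, reusing `closure`, `samples`, and `omega` from above (just an illustration of the mapping, not part of the model):

```python
import numpy as np

# The rows of omega are orthonormal and sum to zero (up to rounding),
# so omega @ omega.T is approximately the identity matrix.
print(np.round(omega @ omega.T, 1))

p = closure(samples.astype(float))   # compositions on the simplex
y = np.log(p) @ omega.T              # coordinates in the orthonormal basis
p_back = closure(np.exp(y @ omega))  # back onto the simplex; p_back ~ p
                                     # up to the rounding in omega
```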