Numerical Issues with StickBreaking in ADVI

I have no insight into why exactly VI gives such a horrible result. I suspect it has something to do with multimodality as well (in the components), or with a local minimum.
For what it's worth, reducing the number of training iterations and switching to another optimizer might help:
[Edit]: this doesn't seem to be a good solution, see the post below.

import pymc3 as pm
import numpy as np
import theano
import theano.tensor as tt
from pymc3.distributions.transforms import t_stick_breaking

np.random.seed(1)
nd = 10
sample = np.random.randint(0, 10000, nd)
def mix(components, decomp):
    return tt.dot(decomp, tt.nnet.softmax(
        tt.horizontal_stack(tt.zeros((nd, 1)), components)))
    
with pm.Model() as model:
    decomp = pm.Dirichlet('decomp', np.ones(10), shape=(1, 10),
                         transform=t_stick_breaking(1e-9))
    components = pm.Normal('components', shape=(nd, nd-1))
    combined = pm.Deterministic('combined', mix(components, decomp))
    obs = pm.Multinomial('obs', np.sum(sample), combined, observed=sample)
    mean_field = pm.fit(method='advi', n=int(1e4), obj_optimizer=pm.adam(),
                        progressbar=False)
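# Map the flat vector of variational means back to the named (transformed) variables.
# A separate name is used so the model variable `decomp` is not shadowed.
means = mean_field.bij.rmap(mean_field.mean.get_value())

print(theano.config.floatX)
# Back-transform the stick-breaking mean onto the simplex.
print(t_stick_breaking(1e-9).backward(means['decomp_stickbreaking__']).eval())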

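Note that I also did some refactoring so that the softmax doesn't make the model unidentifiable: a column of zeros is prepended to the components, which fixes the first logit of each row to zero.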
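
To illustrate the identifiability point, here is a minimal NumPy sketch (not part of the model above; `softmax`, `pinned_softmax`, and the array shapes are just names and sizes I picked for the demo). A plain softmax returns the same probabilities when every logit in a row is shifted by a constant, so the logits are not uniquely determined. With the first logit pinned to zero, the remaining logits can be recovered uniquely from the probabilities.

```python
import numpy as np

def softmax(x):
    # Subtracting the row max is only for numerical stability;
    # it does not change the resulting probabilities.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pinned_softmax(free):
    # Prepend a zero column so the first logit of each row is fixed at 0,
    # mirroring the horizontal_stack(zeros, components) trick in the model.
    zeros = np.zeros((free.shape[0], 1))
    return softmax(np.hstack([zeros, free]))

rng = np.random.default_rng(0)

# Unconstrained logits: a constant shift per row leaves the output unchanged.
logits = rng.normal(size=(3, 4))
same_probs = np.allclose(softmax(logits), softmax(logits + 5.0))

# Pinned logits: the free logits are recoverable as log(p_k / p_0).
free = rng.normal(size=(3, 3))
p = pinned_softmax(free)
recovered = np.log(p[:, 1:] / p[:, :1])
unique = np.allclose(recovered, free)

print(same_probs, unique)
```

This is why `components` has shape `(nd, nd-1)` rather than `(nd, nd)`: one degree of freedom per row is removed by the fixed zero.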