Numerical Issues with StickBreaking in ADVI

I think the problem may arise from the model itself. The theoretical mean of decomp is [0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1], but its distribution is multimodal and hence cannot be approximated well by a normal distribution under the default transformation used by ADVI (presumably StickBreaking). The multimodality comes from the fact that the extreme observation sample cannot be explained well by any of the components. As a result, it is more likely for one single component to assume a very unlikely state that approximates the sample than for all of them to do so.
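To make the transformation argument concrete, here is a minimal NumPy sketch of a stick-breaking map. It uses the offset convention in which the zero vector maps to the uniform point; I am assuming this is close to what the default transform does, but it is not copied from the library, and the name `stick_breaking` is my own.

```python
import numpy as np

def stick_breaking(y):
    """Map unconstrained y (last axis of length K-1) to a point on the K-simplex.

    Illustrative variant only: the log offsets are chosen so that y = 0
    maps to the uniform vector (1/K, ..., 1/K).
    """
    y = np.asarray(y, dtype=float)
    km1 = y.shape[-1]
    offsets = np.log(np.arange(km1, 0, -1))      # log(K-1), ..., log(1)
    z = 1.0 / (1.0 + np.exp(-(y - offsets)))     # fraction broken off at each step
    # Length of stick remaining before each break, plus the final leftover piece.
    remaining = np.concatenate(
        [np.ones(y.shape[:-1] + (1,)), np.cumprod(1.0 - z, axis=-1)], axis=-1
    )
    return np.concatenate([z * remaining[..., :-1], remaining[..., -1:]], axis=-1)

# The centre of a zero-mean Gaussian in the unconstrained space lands on the
# uniform point, which matches the theoretical mean quoted above.
center = stick_breaking(np.zeros(9))

# Pushing a single Gaussian through the map gives one blob on the simplex.
rng = np.random.default_rng(0)
samples = stick_breaking(rng.normal(0.0, 1.0, size=(5000, 9)))
```

Because the map is smooth and one-to-one, a single Gaussian in the unconstrained space stays a single blob on the simplex, so it cannot put mass on several separated corners at once.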

I could do one of these three things to resolve the issue:

  • use a different approximating family in the variational inference, e.g. the set of particles used by SVGD (Stein variational gradient descent)
  • find an alternative to the StickBreaking transformation that maps decomp to a unimodal distribution that can be approximated well with a normal distribution
  • change the model to better explain the observation sample
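As a rough illustration of the first option, the toy below runs a hand-written SVGD update on a one-dimensional bimodal target. It is not the actual model from this thread: the target (an equal mixture of N(-3, 1) and N(3, 1)), the fixed kernel bandwidth `h`, the step size, and the function names are all assumptions chosen for the sketch.

```python
import numpy as np

def grad_log_p(x, mu=3.0):
    # Toy target: equal mixture of N(-mu, 1) and N(+mu, 1).
    # log p(x) = -x^2/2 + log cosh(mu * x) + const, hence this gradient.
    return -x + mu * np.tanh(mu * x)

def svgd_step(x, step=0.1, h=1.0):
    """One SVGD update for 1-D particles with an RBF kernel of fixed bandwidth h."""
    diff = x[:, None] - x[None, :]      # diff[j, i] = x_j - x_i
    k = np.exp(-diff**2 / h)            # k(x_j, x_i)
    grad_k = -2.0 * diff / h * k        # derivative of k(x_j, x_i) w.r.t. x_j
    # Attractive term (kernel-weighted gradients) plus repulsive term (grad_k),
    # averaged over all particles j for each particle i.
    phi = (k * grad_log_p(x)[:, None] + grad_k).mean(axis=0)
    return x + step * phi

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 1.0, size=50)   # start as one blob around zero
for _ in range(1000):
    particles = svgd_step(particles)
```

The particles start as a single cluster and split between the two modes, which a single normal approximation cannot do. In PyMC itself I would expect the equivalent to be a matter of selecting SVGD as the fitting method rather than writing the update by hand.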