It depends on your model. Judging from how the data are generated, it seems best to model them as two linear regressions with a latent break point. Currently, you are also modelling the mixture weights as a linear function of X, which is quite difficult to fit and does not reflect your data-generating process.
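To make the break-point idea concrete, here is a rough sketch in plain Python. It is not a Bayesian model: it uses a least-squares grid search over candidate break points as a stand-in for inferring the latent break point. The data, the slopes, the noise level, and the names `fit_line` and `fit_break_point` are all made up for illustration and are not taken from your code.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def fit_break_point(xs, ys, candidates):
    """Pick the candidate break point that minimises the total squared error
    of two separate line fits (one on each side of the break)."""
    best = None
    for c in candidates:
        left = [(x, y) for x, y in zip(xs, ys) if x < c]
        right = [(x, y) for x, y in zip(xs, ys) if x >= c]
        if len(left) < 3 or len(right) < 3:
            continue  # too few points to fit a line on one side
        _, _, sse_left = fit_line(*zip(*left))
        _, _, sse_right = fit_line(*zip(*right))
        total = sse_left + sse_right
        if best is None or total < best[1]:
            best = (c, total)
    return best[0]

# Hypothetical data: two line segments that meet at x = 4, plus Gaussian noise.
rng = random.Random(0)
true_break = 4.0
xs = [i / 10 for i in range(100)]
ys = [
    (1.0 + 2.0 * x if x < true_break else 15.0 - 1.5 * x) + rng.gauss(0, 0.3)
    for x in xs
]

candidates = [i / 10 for i in range(5, 95)]
estimated_break = fit_break_point(xs, ys, candidates)
print("estimated break point:", estimated_break)
```

In a Bayesian version, the break point would get a prior and each side would have its own intercept and slope, but the structure is the same: X decides which regression applies, rather than entering the mixture weights linearly.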
When you use many components (large K), the stick-breaking weights sum to almost 1, since very little of the stick is left after K breaks.
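You can check this numerically with a small sketch. It assumes the usual construction where each break fraction is drawn from Beta(1, alpha); the value `alpha = 2.0`, the seed, and the function name `stick_breaking_weights` are arbitrary choices for illustration.

```python
import random

def stick_breaking_weights(alpha, K, rng):
    """Draw K truncated stick-breaking weights with Beta(1, alpha) fractions."""
    weights = []
    remaining = 1.0  # length of stick not yet assigned to any component
    for _ in range(K):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(fraction * remaining)
        remaining *= 1.0 - fraction
    return weights

rng = random.Random(42)
for K in (5, 20, 100):
    w = stick_breaking_weights(alpha=2.0, K=K, rng=rng)
    # The gap between sum(w) and 1 is the leftover stick; it shrinks as K grows.
    print(K, sum(w), 1.0 - sum(w))
```

With these fractions the expected leftover after K breaks is (alpha / (1 + alpha))^K, so it decays geometrically in K. For small K the gap can be noticeable, and you may want to renormalise the weights or assign the leftover to the last component.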