This reminds me of another post here.
You can have a look at the notebook: https://gist.github.com/junpenglao/1907bf019906c125f63126ec4bf23880#file-mixture_discourse-ipynb
I think extending the model above into a softmax should solve the problem.
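
To make the softmax idea concrete, here is a rough sketch in plain NumPy. It is not the code from the notebook. I am assuming the model above uses a logistic (sigmoid) link for two categories, and the names `eta`, `p`, and `two_class` are made up for illustration. It only shows that softmax turns K linear predictors into K probabilities, and that it falls back to the sigmoid when K = 2 and one logit is pinned at 0.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def sigmoid(x):
    """Logistic function, the two-category special case."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical linear predictors: 4 observations, K = 3 categories.
rng = np.random.default_rng(0)
eta = rng.normal(size=(4, 3))
p = softmax(eta)  # each row is a probability vector over the 3 categories

# With K = 2 and the second logit fixed at 0, softmax reduces to the sigmoid.
x = np.array([-1.0, 0.0, 2.0])
two_class = softmax(np.stack([x, np.zeros_like(x)], axis=-1))
```

In a real model one of the K logits is usually fixed (for example at 0) so the parameters stay identifiable, since adding a constant to every logit leaves the softmax output unchanged.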