How to cluster an N x K dataset?

Hi,

After reading "Unsupervised Clustering using a Mixture Model" in Probabilistic-Programming-and-Bayesian-Methods-for-Hackers, which uses a mixture model to cluster N x 1 data, I wonder how to apply this idea to an N x K dataset. That is, how do I cluster data with K features instead of just one? All the examples and documentation I can find online cluster a one-dimensional vector. I know Edward can do this easily: in this post, the author used Edward to cluster the Iris data with all four features, but when it came to pymc, the author used just one feature again… Any hints or resources are appreciated.

You can have a look at the LDA notebook in the docs, https://docs.pymc.io/notebooks/lda-advi-aevb.html?highlight=dirichlet, and at this work-in-progress Bayesian GMM notebook: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/WIP/[WIP]%20Bayesian%20GMM.ipynb
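In case it helps to see what changes when going from one feature to K, here is a minimal dependency-free sketch of the mixture likelihood itself. It is not taken from either notebook and it is not pymc code. It assumes diagonal-covariance Gaussian components, so the K-dimensional log density of a row is just the sum of K one-dimensional normal log densities. All names (`component_logps`, `mixture_loglike`, `responsibilities`) and the toy data are hypothetical. In a Bayesian model the weights, means, and scales would get priors instead of being fixed as they are here.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a one-dimensional normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def component_logps(row, weights, mus, sigmas):
    """Return log w_c + log p(row | component c) for every component c.

    Assumption: diagonal covariance, so the density over the K features
    factorizes and the per-feature log densities are simply added up.
    """
    out = []
    for w, mu, sigma in zip(weights, mus, sigmas):
        lp = math.log(w)
        for x_k, mu_k, s_k in zip(row, mu, sigma):
            lp += normal_logpdf(x_k, mu_k, s_k)
        out.append(lp)
    return out


def mixture_loglike(X, weights, mus, sigmas):
    """Marginal log-likelihood of all N rows, with cluster labels summed out."""
    return sum(logsumexp(component_logps(row, weights, mus, sigmas)) for row in X)


def responsibilities(row, weights, mus, sigmas):
    """Posterior probability of each component for one row, given the parameters."""
    lps = component_logps(row, weights, mus, sigmas)
    norm = logsumexp(lps)
    return [math.exp(lp - norm) for lp in lps]


# Hypothetical toy data: N = 4 rows, K = 2 features, two well-separated clusters.
X = [[0.1, -0.2], [-0.3, 0.2], [5.1, 4.8], [4.9, 5.3]]
weights = [0.5, 0.5]                 # mixture weights, one per component
mus = [[0.0, 0.0], [5.0, 5.0]]       # one K-vector of means per component
sigmas = [[1.0, 1.0], [1.0, 1.0]]    # one K-vector of scales per component

total_loglike = mixture_loglike(X, weights, mus, sigmas)

# Hard cluster assignment: pick the most probable component for each row.
labels = []
for row in X:
    resp = responsibilities(row, weights, mus, sigmas)
    labels.append(resp.index(max(resp)))
```

The only difference from the N x 1 case is the inner loop over features in `component_logps`. With a full covariance matrix, that loop would be replaced by a multivariate normal log density.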
