Unsupervised clustering: estimating number & type of subgroups

Usually, you can model this with a Dirichlet process mixture: http://docs.pymc.io/notebooks/dp_mix.html. But as with other mixture models, inference for these kinds of models is difficult, and care must be taken.
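
To give a feel for how a Dirichlet process prior handles the "number of subgroups" question, here is a minimal sketch of the truncated stick-breaking construction of the mixture weights. It is plain Python rather than PyMC, it only draws weights from the prior (no inference), and the function name `stick_breaking_weights`, the truncation level of 20, and the 1% threshold are my own choices for illustration, not something taken from the linked notebook.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking prior.

    alpha      -- concentration parameter (larger spreads mass over more components)
    truncation -- maximum number of components kept (an approximation to the
                  infinite Dirichlet process)
    rng        -- a random.Random instance, passed in for reproducibility
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation - 1):
        beta = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(remaining * beta)
        remaining *= 1.0 - beta
    weights.append(remaining)  # last component takes the leftover, so weights sum to 1
    return weights


rng = random.Random(0)
for alpha in (0.5, 5.0):
    w = stick_breaking_weights(alpha, truncation=20, rng=rng)
    n_noticeable = sum(1 for x in w if x > 0.01)
    print(f"alpha={alpha}: {n_noticeable} components with weight above 1%")
```

In a full model these weights would be combined with per-component parameters and fitted to the data, and that fitting step is where the difficulties mentioned above show up.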

As for the underlying distributions, I don't have a good answer either. But unless you have a strong theoretical motivation or lots of data, the results will probably be indistinguishable across different distributions as long as they have a similar shape (e.g., using a Student t instead of a Normal).
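
As a rough illustration of the "similar shape" point, the sketch below compares the standard Normal density with standard Student t densities on a grid and reports the largest pointwise gap. The degrees-of-freedom values (3, 10, 30) and the grid from -5 to 5 are arbitrary choices of mine; this only compares density curves and says nothing about how a fitted mixture would behave on real data.

```python
import math


def normal_pdf(x):
    """Density of the standard Normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def student_t_pdf(x, nu):
    """Density of the standard Student t distribution with nu degrees of freedom."""
    log_norm = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
    )
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


# Evaluation grid from -5 to 5 in steps of 0.01.
grid = [i / 100 for i in range(-500, 501)]


def max_density_gap(nu):
    """Largest absolute difference between the t and Normal densities on the grid."""
    return max(abs(student_t_pdf(x, nu) - normal_pdf(x)) for x in grid)


for nu in (3, 10, 30):
    print(f"nu={nu}: max |t - Normal| density gap = {max_density_gap(nu):.4f}")
```

The gap shrinks as the degrees of freedom grow, so with a moderate amount of data the two choices are hard to tell apart unless the tails really matter.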