You are asking how to estimate the number of clusters in a data set. That is a hard problem, and there is no simple way to do it. I would keep reading about PyMC3 and Bayesian statistics before worrying too much about this more advanced material. Try to get through more chapters of the book, and things should become clearer.