Oh yeah, that makes total sense. The Mixture of Normals treats each observation independently and, as such, marginalizes over a separate latent categorical variable for each one.
The first non-marginalized model would be more akin to a mixture of 5 multivariate normal components: comp_dists=[MvNormal.dist(mu=np.full(10, i), cov=np.eye(10)) for i in range(5)]
I don’t recommend testing that out in v3 as multivariate mixtures were a bit flaky back then.
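
To make the difference concrete, here is a rough NumPy-only sketch (not PyMC code) of the two log-likelihoods. It assumes equal weights, unit variances, and a made-up data vector `y`. The helper names are just for illustration. The first function marginalizes a separate category per observation; the second marginalizes one category shared by all 10 values, which is what the mixture of multivariate normals with identity covariance amounts to.

```python
import numpy as np

def normal_logpdf(y, mu, sigma=1.0):
    # Log density of a univariate normal.
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def independent_mixture_logp(y, mus, w):
    # One latent category per observation: marginalize first, then sum over observations.
    comp = normal_logpdf(y[:, None], mus[None, :])  # shape (n_obs, n_components)
    return np.sum(logsumexp(np.log(w) + comp, axis=1))

def shared_mixture_logp(y, mus, w):
    # One latent category for the whole vector: sum over observations first, then marginalize.
    comp = normal_logpdf(y[:, None], mus[None, :])  # shape (n_obs, n_components)
    return logsumexp(np.log(w) + comp.sum(axis=0), axis=0)

mus = np.arange(5, dtype=float)   # component means 0..4, as in the comp_dists above
w = np.full(5, 0.2)               # assumed equal mixture weights
y = np.array([0.0, 4.0] * 5)      # hypothetical data that alternates between two components

logp_independent = independent_mixture_logp(y, mus, w)
logp_shared = shared_mixture_logp(y, mus, w)
print(logp_independent, logp_shared)
```

With data that jumps between components like this, the per-observation version assigns a much higher log-likelihood, because the shared version forces all 10 values to come from the same component. With a single observation the two coincide.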