I suspect the for-loop here is the problem: try to vectorize everything if you can. You could probably rework this so that y_obs has shape (n, t, p) or (t, p, n), which would avoid instantiating n separate MatrixNormal objects.
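To make the vectorization idea concrete, here is a minimal NumPy sketch (not PyMC code) of evaluating a matrix normal log-density for all n observations at once by stacking them into an array of shape (n, t, p). The function names and the assumption that every observation shares the same mean, row covariance, and column covariance are mine, not taken from your model. Whether MatrixNormal itself accepts a leading batch dimension depends on your PyMC version, so please check that before relying on it.

```python
import numpy as np


def matrix_normal_logp_loop(y, mu, row_cov, col_cov):
    """Hypothetical reference version: one evaluation per observation."""
    t, p = mu.shape
    row_inv = np.linalg.inv(row_cov)
    col_inv = np.linalg.inv(col_cov)
    _, logdet_row = np.linalg.slogdet(row_cov)
    _, logdet_col = np.linalg.slogdet(col_cov)
    const = -0.5 * (t * p * np.log(2 * np.pi) + p * logdet_row + t * logdet_col)
    out = np.empty(y.shape[0])
    for i in range(y.shape[0]):
        r = y[i] - mu  # (t, p) residual for observation i
        out[i] = const - 0.5 * np.trace(col_inv @ r.T @ row_inv @ r)
    return out


def matrix_normal_logp_batched(y, mu, row_cov, col_cov):
    """Hypothetical vectorized version: y has shape (n, t, p)."""
    t, p = mu.shape
    row_inv = np.linalg.inv(row_cov)
    col_inv = np.linalg.inv(col_cov)
    _, logdet_row = np.linalg.slogdet(row_cov)
    _, logdet_col = np.linalg.slogdet(col_cov)
    const = -0.5 * (t * p * np.log(2 * np.pi) + p * logdet_row + t * logdet_col)
    r = y - mu  # broadcasting: (n, t, p) - (t, p)
    # quad[i] = trace(col_inv @ r[i].T @ row_inv @ r[i]) for every i, no Python loop
    quad = np.einsum("ab,ncb,cd,nda->n", col_inv, r, row_inv, r, optimize=True)
    return const - 0.5 * quad


# Small demo with made-up data (shapes only; values are random).
rng = np.random.default_rng(0)
n, t, p = 6, 4, 3
y_obs = rng.normal(size=(n, t, p))
mu = np.zeros((t, p))
logp = matrix_normal_logp_batched(y_obs, mu, np.eye(t), np.eye(p))
print(logp.shape)
```

Both functions compute the same n log-densities. The batched one replaces the loop with a single broadcasted subtraction and one einsum contraction, which is the kind of restructuring I mean.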
Also, it looks like you have a discrete latent variable in this model. Discrete variables require a sampler designed for them, such as the binary-optimized Gibbs sampler, so the samples may not converge to the posterior as quickly as they would for a model with only continuous latent variables.