So it seems the initial parameter values evaluate to a mu that is essentially zero, which explains the error. I will experiment with better priors using the prior predictive distribution, but to test the theory I simply added a small constant to mu, and the model fit perfectly.
And yes, model.debug(verbose=True) came in handy.
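For anyone who lands here later, this is a rough standalone sketch of why a mu of zero at the starting point can break things. It is not my actual model: the Poisson likelihood, the log link, and the names `poisson_logp`, `eta_init`, and `eps` are all assumptions made up for illustration. It only uses the standard library, so it does not depend on PyMC.

```python
import math

def poisson_logp(k, mu):
    """Log-probability of count k under Poisson(mu); -inf where the probability is zero."""
    if mu < 0:
        return -math.inf
    if mu == 0.0:
        # With mu exactly zero, only k == 0 has non-zero probability.
        return 0.0 if k == 0 else -math.inf
    return k * math.log(mu) - mu - math.lgamma(k + 1)

# Hypothetical starting point: mu = exp(eta) with a very negative linear predictor.
eta_init = -800.0
mu_init = math.exp(eta_init)  # underflows to exactly 0.0 in double precision

eps = 1e-6     # the "small constant" added to mu (value chosen arbitrarily here)
observed = 3   # any positive observed count

logp_raw = poisson_logp(observed, mu_init)            # -inf: the sampler cannot start here
logp_shifted = poisson_logp(observed, mu_init + eps)  # finite once mu is nudged above zero

print(mu_init, logp_raw, logp_shifted)
```

The constant only moves the starting point off the boundary. It does not fix priors that put most of their mass where mu is effectively zero, which is why I still want to check the prior predictive.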
Thanks for the help!