Regardless of the Colab setup: as far as I know and have read, most PyMC3 models do not sample any faster on a GPU than on a CPU.
In my experience, when sampling on the GPU I need to set cores=1.
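As a rough sketch of the cores=1 point, something like the following could pick the sampling arguments based on the device. The helper name `sampling_kwargs` and its defaults are my own invention for illustration, not part of PyMC3. It only assumes that the Theano device string starts with "cuda" when a GPU is in use, and that `pm.sample` accepts `draws`, `chains` and `cores`.

```python
import os


def sampling_kwargs(device, draws=1000, chains=2):
    """Hypothetical helper: build keyword arguments for pm.sample.

    `device` is expected to be the Theano device string, e.g. "cpu" or
    "cuda0" (assumption: GPU devices start with "cuda").
    """
    on_gpu = device.startswith("cuda")
    if on_gpu:
        # On the GPU, run the chains sequentially in a single process.
        cores = 1
    else:
        # On the CPU, use at most one process per chain.
        cores = min(chains, os.cpu_count() or 1)
    return {"draws": draws, "chains": chains, "cores": cores}


# Intended usage (not run here, requires PyMC3 and Theano):
#   import theano
#   import pymc3 as pm
#   with model:
#       trace = pm.sample(**sampling_kwargs(theano.config.device))
```

With cores=1 the chains run one after another, so sampling takes longer in wall-clock time, but that is the setting that has worked for me on the GPU.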
See also: