After sampling, we compute diagnostics of the MCMC chains (effective sample size and the Gelman–Rubin convergence statistic, rhat). This can currently be quite slow if you have a lot of samples (10K in this case). You can switch it off with pm.sample(..., compute_convergence_checks=False), but I would recommend sampling with NUTS instead and using fewer samples.
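To give a feel for what the rhat check involves, here is a minimal sketch of the classic (non-split) Gelman–Rubin statistic in plain Python. This is an illustration only, not the code PyMC actually runs; the library's own diagnostics are more elaborate, and the function name gelman_rubin and the toy chains below are made up for this example. It does show that the statistic needs a pass over every draw of every chain, which is part of why the checks get slower as the number of samples grows.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains.

    Illustrative sketch only (hypothetical helper, not a PyMC function).
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")
    if any(len(c) != n for c in chains):
        raise ValueError("all chains must have the same length")

    # W: average of the within-chain sample variances.
    w = mean([variance(c) for c in chains])
    # B / n: sample variance of the per-chain means.
    b_over_n = variance([mean(c) for c in chains])
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * w + b_over_n
    return (var_hat / w) ** 0.5


rng = random.Random(0)

# Toy example 1: four chains drawn from the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
rhat_mixed = gelman_rubin(mixed)

# Toy example 2: chains stuck around two different values.
stuck = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 0.0, 5.0, 5.0)]
rhat_stuck = gelman_rubin(stuck)

print(f"rhat (well mixed): {rhat_mixed:.3f}")
print(f"rhat (stuck):      {rhat_stuck:.3f}")
```

Values close to 1 suggest the chains agree with each other; values well above 1 suggest they have not converged to the same distribution.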
As for profiling, see https://docs.pymc.io/notebooks/profiling.html