Progress bar says sampling is finished, but keeps running for a long time?

Newbie question here: I've made my first pymc3 model and have MCMC sampling running. I get the following output: the progress bar completes fairly quickly, but then it sits there doing something for a very long time before the operation finishes.

Is it doing some post-processing step that could be avoided (or deferred), or is the progress bar just not particularly useful?

In general, what is the pymc3 way to profile models to find the bottlenecks? I think I'm doing some matrix manipulations in the model in a sub-optimal way, but I'm not sure how to profile the model run to find where to spend time optimising.

After sampling, we compute diagnostics of the MCMC chains (effective sample size and the Gelman–Rubin convergence statistic, rhat), which can currently be quite slow if you have a lot of samples (10K in this case). You can switch this off with pm.sample(..., compute_convergence_checks=False), but I would recommend sampling with NUTS and using fewer samples instead.
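To give a feel for what that post-sampling step has to compute, here is a minimal sketch of the classic Gelman–Rubin rhat for a single scalar parameter. It is an illustration only: the function name gelman_rubin_rhat is made up for this example, and this is the textbook (non-split, non-rank-normalized) formula, not necessarily what pymc3 runs internally. The work is repeated for every scalar in the model and touches every draw, which is roughly why it gets slow with many samples and many parameters.

```python
import math
import statistics


def gelman_rubin_rhat(chains):
    """Classic Gelman-Rubin rhat for one scalar parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    Hypothetical helper for illustration; not part of pymc3.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    # Within-chain variance W: average of the per-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero; rhat is undefined")

    # Between-chain variance B: n times the variance of the chain means.
    chain_means = [statistics.fmean(c) for c in chains]
    between = n * statistics.variance(chain_means)

    # Pooled estimate of the posterior variance, then the ratio.
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)


# Two chains stuck in different places give an rhat far above 1.
disagreeing = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
print(gelman_rubin_rhat(disagreeing))
```

Values close to 1 suggest the chains agree with each other; values well above 1 suggest they have not mixed.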

As for profiling, see


I am finding that even with the compute_convergence_checks=False flag, I experience the same behavior. The progress bar indicates that the sampling is done, but something continues to run…


It turns out after profiling that in this case, drawing the traceplot was what was taking the time!
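For anyone hitting the same thing, a rough way to see which stage of a script is eating the time is to wrap each stage in a timer. This is only a sketch using the standard library: run_sampling and draw_traceplot are hypothetical stand-ins (they just sleep) for your own calls such as pm.sample(...) and the traceplot call, and timed is a small helper written for this example.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock time spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def run_sampling():
    # Stand-in for the real sampling call, e.g. pm.sample(...).
    time.sleep(0.01)


def draw_traceplot():
    # Stand-in for the real plotting call on the resulting trace.
    time.sleep(0.1)


with timed("sampling"):
    run_sampling()

with timed("traceplot"):
    draw_traceplot()

# Report the stages from slowest to fastest.
for label, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {seconds:.3f} s")
```

If the plotting stage dominates, the sampler itself is not the problem, and running the whole script under python -m cProfile can narrow it down further.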

Yeah, that’s what I was gonna suggest. You can use az.plot_trace(trace, compact=True) when your model has many multi-dimensional parameters.