I am trying to use PyMC3 to fit the spectra of galaxies. The model I use to fit the spectra is currently described by four parameters. At present, I am trying to fit simulated spectra (i.e., data) to assess (a) how reliably PyMC3 is able to constrain the known model parameters and (b) how quickly it converges.

All the parameters in my model are continuous, so I’m using the NUTS sampler. When I only fit a single parameter (i.e., fix the other three at the known truth values) the sampler runs quickly (advi stage at ~5000 it/s, sampling stage at ~1000 it/s). However, the rate of sampling falls dramatically as I add more free parameters. For example, when sampling all four parameters the advi stage still runs fairly quickly (at ~3000 it/s), but the sampling stage falls to ~20 it/s.

By contrast, if I use Metropolis to sample all four parameters, I get a sampling rate of ~1000 it/s.

The fact that it samples quickly with a single free parameter and Metropolis suggests to me that the low sampling rate is not due to each model evaluation taking a long time. Instead, I was wondering whether it could be related to NUTS’s gradient calculation.

I was just wondering whether this sounds like normal behaviour (i.e., slow sampling with just four model parameters) and if you have any advice on how I could speed things up (I’d like to use this to fit the spectra of ~tens of thousands of galaxies, so speed is fairly important).

Code available at:

https://github.com/SheffAGN/SEDist (see test.py).

Thanks for your help, and keep up the good work!