I was wondering why the Metropolis routine implemented in PyMC3 does not have adaptive schemes. (I know Metropolis is not the recommended default sampler, but I am using it because I have an external model with 4 parameters that I wrap in a Theano as_op, which has no gradient.)
I came across a few references (DRAM: Efficient adaptive MCMC) and a MATLAB toolbox that uses such schemes. The toolbox performs really well in my cases (it converges in a few dozen steps, and 2000-5000 samples are then enough), whereas PyMC3 fails me (the estimated parameters are barely within the 2 sigma range after hundreds of thousands of samples).
As the author of the aforementioned toolbox puts it, “the covariance matrix of the proposal distribution can be adapted during the simulation according to adaptive schemes described in the references”. I am pretty sure this is why these methods converge really fast and sample well where the standard Metropolis algorithm implemented in PyMC3 fails me (I of course use the same external model and Gaussian priors in both).
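To make concrete what I mean by adapting the proposal covariance, here is a minimal NumPy sketch of the general idea as I understand it. It is not the toolbox's code and not a PyMC3 API: the function name `adaptive_metropolis`, its parameters, and the toy target are all made up for illustration. It covers only the adaptive part (no delayed rejection), and the scaling factor 2.4²/d is the value commonly quoted in the adaptive Metropolis literature.

```python
import numpy as np


def adaptive_metropolis(logp, x0, n_samples, adapt_start=100,
                        adapt_every=50, eps=1e-8, seed=0):
    """Random-walk Metropolis whose Gaussian proposal covariance is
    re-estimated from the chain history (hypothetical helper, a sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    scale = 2.4 ** 2 / d          # commonly quoted scaling for dimension d
    cov = 0.1 * np.eye(d)         # arbitrary initial proposal covariance
    chain = np.empty((n_samples, d))
    lp = logp(x)
    n_accept = 0
    for i in range(n_samples):
        proposal = rng.multivariate_normal(x, cov)
        lp_prop = logp(proposal)
        # Standard Metropolis accept/reject step (symmetric proposal).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = proposal, lp_prop
            n_accept += 1
        chain[i] = x
        # Periodically replace the proposal covariance with the scaled
        # empirical covariance of all samples so far, plus a small
        # regularisation term that keeps it positive definite.
        if i + 1 >= adapt_start and (i + 1) % adapt_every == 0:
            emp = np.atleast_2d(np.cov(chain[: i + 1], rowvar=False))
            cov = scale * (emp + eps * np.eye(d))
    return chain, n_accept / n_samples


# Toy usage: a strongly correlated 2-D Gaussian standing in for my model.
target_mean = np.array([1.0, -2.0])
target_prec = np.linalg.inv(np.array([[1.0, 0.9], [0.9, 1.0]]))


def toy_logp(x):
    r = x - target_mean
    return -0.5 * r @ target_prec @ r


samples, accept_rate = adaptive_metropolis(toy_logp, [0.0, 0.0], 2000)
```

The point of the sketch is that the proposal learns the correlations between parameters from the chain itself, which a fixed, uncorrelated proposal cannot do.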
These methods have been available for over 10 years, so I was wondering why they have not been implemented in PyMC. I am no expert in the field; I am just trying to use modern tools to do parameter estimation, so I may be wrong in my interpretation of what is recommended and what is available in PyMC. But since PyMC is often on the cutting edge of research, sometimes implementing methods very soon after a paper is published, I am curious.