As a pymc3 user, the initialization recipes in `pymc3/sampling.py` are really useful for getting started. But often I'll want to customize a kwarg for one of those intermediate steps (e.g., the optimizer used for ADVI in `init_advi`, or the minimizer for `find_MAP`), and those aren't exposed through the main `sample(*args, **kwargs)` API.
I'm then left re-creating some of those recipes in my own script just to reach those options, which is cumbersome and requires a bit of hunting around for the right inputs.
We could certainly expose more of the internals through the main `sample` API, but I'm not sure that's the best option. scikit-learn has a similar problem and solves it really nicely with its `Pipeline` class, which lets you chain different methods together while customizing each one's hyperparameters, e.g.:

```python
Pipeline([('anova', anova_filter), ('svc', clf)])
```
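For context, here's roughly what that sklearn snippet looks like end-to-end (the synthetic data and specific hyperparameters are just illustrative): an ANOVA feature filter chained with a linear SVM, where each step keeps its own kwargs and stays inspectable after fitting.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy data, just so the pipeline has something to fit.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

anova_filter = SelectKBest(f_classif, k=5)  # step 1: feature selection
clf = SVC(kernel='linear')                  # step 2: classifier

pipe = Pipeline([('anova', anova_filter), ('svc', clf)])
pipe.fit(X, y)

# Each named step remains accessible after fitting:
print(pipe.named_steps['anova'].get_support())  # which features survived
print(pipe.score(X, y))
```

The key property is that second-to-last line: intermediate steps don't vanish into the final result, which is exactly what's missing from `sample(init='...')` today.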
What we're really building in `sampling.py` (and changing with the different `init` arguments) is an "inference pipeline", with various initialization, approximation, and sampling methods.
It would be awesome if we could do something similar in pymc3, and build up our inference algorithm with something like:

```python
pm.Pipeline([
    ('map',        pm.find_MAP(...)),
    ('advi',       pm.ADVI(n=50000, optimizer=...)),
    ('adapt_diag', pm.some_fn()),
    ('hmc',        pm.NUTS(...)),
])
```
It would 1) be more customizable, instead of continuing to accumulate recipes in `sample(init='...')`, and 2) let you go back and inspect intermediate results, for instance how closely the ADVI fit approximated the posterior the sampler eventually found.
Just a thought, I’m sure it would take a pretty concerted effort to re-engineer the API.