As a pymc3 user, I find the initialization recipes in `pymc3/sampling.py` really useful for getting started. But often I want to customize or change a kwarg for one of the intermediate steps (e.g., the optimizer used for ADVI in `init_advi`, or the minimizer for `find_MAP`), and those aren't exposed through the main `sample(*args, **kwargs)` API.
I'm then left trying to re-create some of those recipes in my main script so I can access those features, but that's cumbersome and requires a bit of hunting around for the right inputs.
We could certainly expose additional knobs through the main `sample` API, but I'm not sure that's the best option. scikit-learn has a similar problem and has solved it really nicely with their `Pipeline` class, which lets you chain different methods together while customizing each one's hyperparameters, e.g.:

```python
Pipeline([('anova', anova_filter),
          ('svc', clf)])
```
What we're really building in `sample.py` (and changing with the different `init` arguments) is an "inference pipeline", with various initialization, approximation, and sampling methods.
It would be awesome if we could do something similar in pymc3, and build up our inference algorithm with something like:

```python
pm.Pipeline([
    ('map', pm.find_MAP(...)),
    ('advi', pm.ADVI(n=50000, optimizer=...)),
    ('adapt_diag', pm.some_fn()),
    ('hmc', pm.NUTS(...)),
])
```
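No such `pm.Pipeline` exists today, so purely as a sketch of the idea, here's a minimal pure-Python version (all names and semantics here are hypothetical stand-ins, not pymc3 API): each named step is a callable that receives the previous step's result, and every intermediate result is kept around for later inspection.

```python
class Pipeline:
    """Hypothetical sketch: chain named inference steps.

    Each step is a callable taking the previous step's result
    (None for the first step) and returning its own result.
    """

    def __init__(self, steps):
        self.steps = list(steps)   # list of (name, callable) pairs
        self.results_ = {}         # intermediate results, kept by name

    def run(self, start=None):
        result = start
        for name, step in self.steps:
            result = step(result)
            self.results_[name] = result  # inspectable after the run
        return result


# Stand-in callables; real steps would wrap find_MAP, ADVI, NUTS, etc.
pipe = Pipeline([
    ("map",  lambda _: {"mu": 0.0}),
    ("advi", lambda prev: {"mu": prev["mu"] + 0.1}),
])
final = pipe.run()
# pipe.results_["map"] still holds the intermediate MAP result
```

In this shape, each step gets the previous step's output as its starting point, much like how `sample(init=...)` threads a start point and mass-matrix estimate into NUTS today, and `pipe.results_` is what would let you go back and compare, say, the ADVI fit against the final samples.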
This would 1) be more customizable, instead of continuing to accumulate recipes in `sample(init='...')`, and 2) let you go back and inspect, for instance, how closely the ADVI fit came to approximating your sampler.
Just a thought; I'm sure it would take a pretty concerted effort to re-engineer the API.