Some questions about PyMC Models in online settings

Nutpie lets you update shared variables with the with_data method, so you can swap in new data without recompiling the model: https://github.com/pymc-devs/nutpie/blob/34dd3d98bc0fb08b7e91971ec50642af8a9bba89/python/nutpie/compile_pymc.py#L62

That’s what I considered in one project where we wanted to do NUTS inside NUTS.

Otherwise, you could consider SMC. It is supposed to suit online learning because, once you get more data, you can use your existing posterior draws as the initial particles. Getting the particles close to the posterior is a big chunk of the work in SMC, so if they already start there it could be faster. That’s the theory at least; I never had the chance to try it out. If the PyMC-based sampler is too slow, you can try the one from blackjax.
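To make the "start the particles at the old posterior" idea concrete, here is a minimal standard-library sketch of a single reweight-and-resample step. It is not how PyMC's or blackjax's SMC is implemented. The toy Normal model, the two hard-coded data batches, and the helper names (`conjugate_update`, `log_lik`) are all assumptions made for illustration, and the "old posterior draws" are simulated from the exact conjugate posterior rather than taken from a real sampling run.

```python
import math
import random

random.seed(0)

# Hypothetical toy model: y ~ Normal(mu, 1) with prior mu ~ Normal(0, 10).
PRIOR_MEAN, PRIOR_SD, NOISE_SD = 0.0, 10.0, 1.0


def conjugate_update(mean, sd, data):
    """Exact Normal-Normal posterior for mu; used only as a reference."""
    precision = 1.0 / sd**2 + len(data) / NOISE_SD**2
    new_mean = (mean / sd**2 + sum(data) / NOISE_SD**2) / precision
    return new_mean, math.sqrt(1.0 / precision)


def log_lik(mu, data):
    """Log-likelihood of a batch under Normal(mu, NOISE_SD), up to a constant."""
    return sum(-0.5 * ((y - mu) / NOISE_SD) ** 2 for y in data)


# Made-up data: the batch already fitted, and the batch that arrives later.
old_batch = [1.2, 0.7, 1.9, 0.4, 1.1, 1.5, 0.9, 1.3, 0.6, 1.4]
new_batch = [1.0, 1.6, 0.8, 1.2]

# Stand-in for posterior draws saved from an earlier run on old_batch.
m_old, s_old = conjugate_update(PRIOR_MEAN, PRIOR_SD, old_batch)
particles = [random.gauss(m_old, s_old) for _ in range(5000)]

# Reweight each particle by the likelihood of the new batch only.
log_w = [log_lik(mu, new_batch) for mu in particles]
max_log_w = max(log_w)  # subtract the max for numerical stability
weights = [math.exp(lw - max_log_w) for lw in log_w]
total = sum(weights)
weights = [w / total for w in weights]

# Effective sample size: how many particles still carry real weight.
ess = 1.0 / sum(w * w for w in weights)
smc_mean = sum(w * mu for w, mu in zip(weights, particles))

# Multinomial resampling gives equally weighted draws from the updated posterior.
resampled = random.choices(particles, weights=weights, k=len(particles))

# Reference: the exact posterior after seeing both batches.
exact_mean, exact_sd = conjugate_update(m_old, s_old, new_batch)
print(f"reweighted mean={smc_mean:.3f}, exact mean={exact_mean:.3f}, ESS={ess:.0f}")
```

When the new batch is small relative to the old one, the effective sample size stays high and this one step is almost all the work. A full SMC sampler would typically also temper the new likelihood in stages and apply MCMC move steps after resampling, which matters once the new data shift the posterior substantially.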

Variational inference might be a good candidate.

The default PyMC samplers are unfortunately very black-boxy, and there is no easy way to reuse cached functions or to stop and resume sampling. They are in dire need of being refactored to be more functional and less OOP-like.
