Contribute to PyMC4

Hi everyone,

I will soon have some time on my hands. I thought about rolling my own framework in Rust, but after some reflection I am not sure it would bring anything new (beyond clarifying my own thoughts about what a good probabilistic language should look like), and it would be an unnecessary duplication of effort. So I am just going to throw out a few things I would have liked to have when I was using PyMC3; maybe there are some you would like me to investigate and try to integrate into the library:

  • Iterative sampling à la deep learning. I may have missed a feature, but I have always found it frustrating that inference has to proceed in batches. I have a little experience with deep learning, and the ability to plug in TensorBoard to get metrics while training is just so useful. There may be a good reason we don’t want that here, but being able to sample iteratively would be a big plus for me (see the sketch after this list).
  • Being able to restart inference from a checkpoint. Again, I might have missed something.
  • Continual learning. I know this is a tough one, but isn’t this the big claim of Bayesian statistics?
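
To make the first two points concrete, here is roughly the interface I have in mind (hypothetical names, nothing like this exists in PyMC today): the sampler is a plain generator, so live metrics and checkpointing fit in an ordinary for loop.

```python
# Hypothetical sketch of iterative sampling: a generator yields one draw at
# a time, so logging and checkpointing live in the user's own loop.
import pickle
import random

def sample_iter(step_fn, state):
    """Yield draws one at a time; step_fn is any Markov transition."""
    while True:
        state = step_fn(state)
        yield state

def random_walk(x):
    # Stand-in transition kernel; a real one would be NUTS/HMC.
    return x + random.gauss(0.0, 1.0)

total = 0.0
for i, draw in enumerate(sample_iter(random_walk, 0.0)):
    total += draw
    if i % 100 == 0:
        print(f"step {i}: running mean {total / (i + 1):.3f}")  # live metric
        with open("chain.ckpt", "wb") as f:
            pickle.dump(draw, f)  # state to restart inference from later
    if i >= 500:
        break
```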

Let me know if there is something you find particularly interesting or would like to see integrated into the library. I’m also happy to help with something else.


Hi @rlouf,
Love your work on 🤗 transformers! Are you looking to integrate it with some Bayesian methods?

As for your question itself, I think these are certainly features that could be added, but this kind of lower-level control probably makes more sense to implement in TF/TFP itself. FWIW, you can use PyMC4 to generate the log_prob/loss function and plug it into an inference workflow of your choice.
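
For instance, any callable log-probability can be handed to TFP’s samplers. In this sketch a hand-written target_log_prob stands in for one generated from a PyMC4 model (the PyMC4 extraction step itself is omitted):

```python
# Sketch: plug a target log-probability into TFP's own sampling loop.
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def target_log_prob(x):
    # Stand-in for a log_prob generated from a PyMC4 model.
    return tfd.Normal(loc=0.0, scale=1.0).log_prob(x)

kernel = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=target_log_prob,
    step_size=0.1,
    num_leapfrog_steps=5,
)
samples, is_accepted = tfp.mcmc.sample_chain(
    num_results=1000,
    num_burnin_steps=200,
    current_state=tf.constant(0.0),
    kernel=kernel,
    trace_fn=lambda _, pkr: pkr.is_accepted,
)
```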

As for iterative sampling, do you mean being able to write a for loop that trains and fetches metrics at the same time?

Hi,

Thanks!

Yes. I know it doesn’t make sense for small models/datasets, but I’ve had to sample big models in the past, and being able to compute metrics while training would have been super useful. I also suspect that as these methods become more widespread, people will want to run deep-learning-style, very long trainings, in which case being able to monitor the model may be useful.

I started writing an inference “machine” in Rust that is an iterator over samples, and the flow feels pretty nice. Rust being a low-level language, this adds very little overhead. Maybe it is implemented like this in TFP already?
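
From a quick look, TFP kernels do expose a one_step method, so the same iterator pattern can be driven from Python. A rough sketch (it would need tf.function around the step to be fast):

```python
# Sketch: drive a TFP kernel one draw at a time, computing metrics in-loop.
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

kernel = tfp.mcmc.RandomWalkMetropolis(
    target_log_prob_fn=lambda x: tfd.Normal(0.0, 1.0).log_prob(x),
)
state = tf.constant(0.0)
results = kernel.bootstrap_results(state)

total = 0.0
for i in range(1000):
    state, results = kernel.one_step(state, results)
    total += float(state)
    if i % 100 == 0:
        print(f"draw {i}: running mean {total / (i + 1):.3f}")  # live metric
```

Stepping eagerly like this trades speed for control; the appeal is that checkpointing is just pickling `state` between iterations.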

Something else I think is interesting for big models is minibatch inference. But again, I’m not sure this is something you want in a high-level library right now.
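
The usual trick, as I understand it, is to rescale the batch log-likelihood by N/B so the minibatch log-density is an unbiased estimate of the full one. A sketch with illustrative names:

```python
# Sketch of minibatch inference: rescale the batch log-likelihood by N/B so
# its expectation matches the full-data log-likelihood (standard SVI trick).
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def minibatch_log_prob(theta, batch, n_total):
    prior = tfd.Normal(0.0, 1.0).log_prob(theta)
    lik = tf.reduce_sum(tfd.Normal(theta, 1.0).log_prob(batch))
    scale = n_total / tf.cast(tf.size(batch), tf.float32)
    return prior + scale * lik  # unbiased estimate of the full log-density
```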

I’m also happy to help with anything (code, docs, examples).

These are just my thoughts, but I have always found state space modeling fascinating and a fundamental way to do time-series statistics, and I would be willing to help with the coding for any effort going in that direction.