Bayesian Backpropagation

In the last couple of years, there has been some work (see "Parallel Bayesian Online Deep Learning for Click-Through Rate Prediction in Tencent Advertising System" or "Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks") on scaling Bayesian Deep Learning by using Bayesian Backpropagation instead of a traditional sampling method like NUTS.
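The two papers differ in the details (Probabilistic Backpropagation propagates Gaussian moments rather than optimizing an ELBO), but the shared idea is replacing sampling with a gradient-based fit of a Gaussian approximation over the weights. PyMC3's ADVI is in that same family, so for concreteness, here is a rough toy sketch of the idea; the data and the tiny network are made up purely for illustration, and none of this is tuned:

```python
import numpy as np
import pymc3 as pm
import theano.tensor as tt

# Toy binary-classification data (a made-up stand-in for a real dataset).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # XOR-like, not linearly separable

n_hidden = 5
with pm.Model() as bnn:
    # Gaussian priors on the weights; ADVI fits a factorized Gaussian
    # posterior over them by stochastic gradient ascent on the ELBO.
    w_in = pm.Normal('w_in', 0., 1., shape=(2, n_hidden))
    w_out = pm.Normal('w_out', 0., 1., shape=(n_hidden,))

    act = tt.tanh(tt.dot(X, w_in))
    p = pm.math.sigmoid(tt.dot(act, w_out))
    pm.Bernoulli('obs', p=p, observed=y)

    # Gradient-based variational fit instead of NUTS sampling.
    approx = pm.fit(n=30000, method='advi')
    trace = approx.sample(1000)
```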

I’m curious whether anyone in the community has experience or results using this approach in place of sampling for their BNNs or other traditional Bayesian models. It certainly seems interesting, since the obvious drawback of current Bayesian DL approaches is their speed relative to traditional DL.

I’ll probably spend some time testing it out when I can, but I just wanted to see what sort of conversation I could drum up.

P.S. to the active contributors of PyMC3, I’m a huge fan - thanks for all your work!


I agree with you that it would be nice to have more advanced methods for fitting deep models in a principled way. My sense is that if you want to focus on algorithms that are useful in non-Gaussian settings (perhaps unlike the second paper you linked), then you can look either at methods that smoothly transition from gradient-descent-like optimization into posterior sampling (e.g. stochastic gradient Langevin dynamics, https://www.ics.uci.edu/~welling/publications/papers/stoclangevin_v6.pdf) or at methods employing some sort of optimal transport formulation, like the SVGD option for non-sampling inference already available in PyMC3. I am also very interested to hear other people's opinions on this.
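For anyone curious, SVGD is a one-line change to `pm.fit`. A minimal sketch on a deliberately trivial toy model (the data and particle count here are illustrative, not tuned):

```python
import numpy as np
import pymc3 as pm

# Toy data: noisy observations of a scalar mean.
y = np.random.RandomState(1).normal(loc=2.0, scale=1.0, size=100)

with pm.Model() as model:
    mu = pm.Normal('mu', 0., 10.)
    sigma = pm.HalfNormal('sigma', sd=5.)
    pm.Normal('obs', mu=mu, sd=sigma, observed=y)

    # SVGD transports a set of particles toward the posterior;
    # n_particles trades off posterior fidelity against compute.
    approx = pm.fit(n=20000, method='svgd',
                    inf_kwargs=dict(n_particles=100))
    trace = approx.sample(1000)
```

For anything deep-model sized you would want to think harder about the number of particles, since the cost scales with it.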
