How are Figaro (Scala) / Anglican different from PyMC3/Stan?

nanounanue · September 6, 2018, 3:28pm

I just posted this on Stack Overflow, and the question was rejected as too broad. I am still looking for answers, so I hope this community can help me:

I just started to learn PyMC3, and probabilistic programming in general, but after some exploration I have the following question: how are Figaro (Scala) and Anglican (Clojure) different from PyMC3 and Stan?

Obviously, the first difference is the host language (Scala, Clojure, Python). The second difference is that, at least from the examples around the web and in books, PyMC3 and Stan are mostly used for Bayesian hierarchical models and Bayesian inference, while the (very few) examples for Figaro and Anglican are apparently more general.

In the book Practical Probabilistic Programming, the author describes a probabilistic reasoning system as capable of doing the following: (a) predict future events, (b) infer the cause of effects, and (c) learn from past events to better predict future events (pages 8-9), and he gives examples of this in the book using Figaro. Are PyMC3 and Stan capable of doing this? Are there examples of this? The same goes for Anglican: in several talks (including this one) they mention the use of Anglican for systems that show the capabilities described above (in this case, for a physical simulation of offshore oil platforms).

Maybe a better way of asking this is: is it possible to implement all the examples of the aforementioned book in PyMC3 or Stan?

junpenglao · September 7, 2018, 9:09am

I think the recent thesis by Tom Rainforth gives a pretty good description, though it is a bit biased in favor of Anglican because he’s from Frank Wood’s team. See Chapter 4:

nanounanue · September 7, 2018, 3:38pm

@junpenglao This is awesome, thank you so much!

cscherrer · September 15, 2018, 5:18pm

Hi @nanounanue, I just saw your question. I’ve used Stan and PyMC3 a lot, and I served as technical lead on the evaluation team for PPAML, the DARPA program that funded Figaro and Anglican. I like @junpenglao’s suggestion of checking out Rainforth’s thesis, but I wonder if some discussion could be helpful as well.

There’s an analogy I’ve found helpful in thinking about how these systems differ. Say someone vaguely asks you to “drive them to a thing”, and you get to choose between a luxury sedan and one of those hardcore off-road buggy things (there must be a term for those, I’ll go with “buggy”) like they used in Westworld. Which is the better choice?

If you need to be able to go absolutely anywhere, you should take the buggy. If it’s in-town driving, the buggy will be fun for a while and then probably get tiring. And if it’s a highway road trip, it will not be nearly as enjoyable.

By constraining the problem space (“normal” road driving), the sedan can do a better job when it’s in its preferred space. That doesn’t mean the sedan is “better”, or that you couldn’t use the buggy for absolutely everything.

Ok, back to the point. All of the systems you mention satisfy Avi Pfeffer’s description of a probabilistic reasoning system. “Universal” probabilistic programming (as in Anglican) is the buggy. It can represent any model you throw at it. Stan is the other extreme (the sedan), limited to a fixed-dimensional, continuous parameter space with a differentiable log density. Stan can’t represent nearly as many models. Things like mixture models are a bit awkward, because you have to jump through some hoops to sum over the discrete values (weird that they haven’t automated that). But when you do sum them out, you get the benefit of “Rao-Blackwellization”, making inference more efficient. And Stan generates efficient C++ code, so (after you get past the compilation overhead) inference is very fast.
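To make the "sum over the discrete values" step concrete, here is a minimal sketch in plain Python (standard library only, no Stan or PyMC3 involved). It assumes a Gaussian mixture with known weights, means, and scales; the function names are just illustrative. Instead of sampling a discrete component label for each data point, the label is summed out with a log-sum-exp, which leaves a log likelihood that depends only on continuous quantities.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def component_terms(x, weights, mus, sigmas):
    """log(weight_k) + log p(x | component k), one entry per component."""
    return [
        math.log(w) + normal_logpdf(x, mu, s)
        for w, mu, s in zip(weights, mus, sigmas)
    ]


def mixture_loglik(data, weights, mus, sigmas):
    """Log likelihood with the discrete component label summed out."""
    return sum(
        log_sum_exp(component_terms(x, weights, mus, sigmas)) for x in data
    )


def responsibilities(x, weights, mus, sigmas):
    """Posterior probability of each component for one data point."""
    terms = component_terms(x, weights, mus, sigmas)
    norm = log_sum_exp(terms)
    return [math.exp(t - norm) for t in terms]
```

For example, `responsibilities(0.1, [0.3, 0.7], [-2.0, 2.0], [1.0, 1.0])` recovers the per-point label probabilities after the fact, so nothing is lost by never sampling the label directly.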

Well, that almost does it. There’s still the potential for general-purpose systems like Anglican to recognize (automatically or through user input) that the model is constrained, and to go with a specialized algorithm. In principle, a universal PPL can give the best of both worlds. In practice, this can come down to the time and energy of the development team, characteristics of the language they’re using for development, and the degree to which they can leverage external tools.

In terms of “representable models”, it’s something like
Stan < PyMC3 < Figaro < Anglican
(Figaro has some advantage over PyMC3 because it allows an “open world” - you can introduce new variables as you go).
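As a rough illustration of what "introduce new variables as you go" means, here is a sketch in plain Python (standard library only). This is not Figaro or Anglican code, and the model, priors, and function names are all made up for the example. The number of latent objects is itself random, so the number of latent variables differs from one run of the program to the next, which is exactly what a fixed-dimensional parameter space cannot express directly. Inference is done with simple likelihood weighting (sampling from the prior and weighting by the likelihood), chosen only because it is short, not because it is what those systems use.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def sample_model(rng):
    """Generative program whose number of latent variables is random."""
    positions = [rng.gauss(0.0, 5.0)]      # there is always at least one object
    while rng.random() < 0.5:              # keep adding objects with prob. 0.5
        positions.append(rng.gauss(0.0, 5.0))
    return positions


def log_weight(positions, observations, noise=1.0):
    """Log likelihood: each observation comes from a uniformly chosen object."""
    total = 0.0
    for y in observations:
        density = sum(
            math.exp(normal_logpdf(y, p, noise)) for p in positions
        ) / len(positions)
        total += math.log(density + 1e-300)  # small floor avoids log(0)
    return total


def posterior_over_count(observations, num_samples=20000, seed=0):
    """Likelihood-weighted estimate of P(number of objects | observations)."""
    rng = random.Random(seed)
    weights = {}
    for _ in range(num_samples):
        positions = sample_model(rng)
        w = math.exp(log_weight(positions, observations))
        k = len(positions)
        weights[k] = weights.get(k, 0.0) + w
    total = sum(weights.values())
    return {k: w / total for k, w in sorted(weights.items())}
```

Calling `posterior_over_count([-3.0, -2.8, 3.1, 2.9])` returns a dictionary mapping each sampled object count to its estimated posterior probability. Sampling from the prior like this scales poorly, which is part of why the general-purpose systems put so much work into their inference algorithms.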

Overall, these are all great, and I think the primary criterion I would use to decide is which language you prefer to work in:
C++, command line, or R -> Stan
Python -> PyMC3
Scala -> Figaro
Clojure -> Anglican

_eigenfoo · September 18, 2018, 4:27am

@cscherrer That was an amazing overview, thank you!

cscherrer · September 18, 2018, 4:43am

That’s very kind, thank you. Glad it was helpful
