The Why and How of One Domain-Specific PyMC3 Extension by Dan Foreman-Mackey

Talk Abstract

In this talk I will describe some of the unique challenges encountered in probabilistic modeling for astrophysics and some approaches taken to overcome these obstacles. In particular, I will discuss the motivation for and development of the domain-specific exoplanet package. This library implements a suite of custom Theano ops to evaluate astronomy-specific functions and their gradients, custom PyMC3 distributions for physically-motivated reparameterizations, and functions to help astronomers port existing habits to the PyMC3 ecosystem. exoplanet also includes an implementation of scalable Gaussian Process regression in one dimension that is generally applicable beyond astrophysics. Besides these technical details, I will also discuss some of the barriers that exist for domain scientists who are new to PyMC3, and some proposals for lowering these barriers.

Talk

Dan Foreman-Mackey

Dan is an Associate Research Scientist at the Flatiron Institute’s Center for Computational Astrophysics studying the application of probabilistic data analysis techniques to solve fundamental problems in astrophysics.


This is a PyMCon 2020 talk

Learn more about PyMCon!

PyMCon is an asynchronous-first virtual conference for the Bayesian community.

We posted all the talks here on Discourse on October 24th, one week before the live PyMCon session, for everyone to see and discuss at their own pace.

If you are available on October 31st, you can register for the live session here! If you are not, don’t worry: all the talks are already available here on Discourse (keynotes will be posted after the conference), and you can network here on Discourse and on our Zulip.

We value the participation of each member of the PyMC community and want all attendees to have an enjoyable and fulfilling experience. Accordingly, all attendees are expected to show respect and courtesy to other attendees throughout the conference and at all conference events. Everyone taking part in PyMCon activities must abide by the PyMCon Code of Conduct. You can report any incident through this form.

If you want to support PyMCon and the PyMC community but you can’t attend the live session, consider donating to PyMC

Do you have suggestions to improve PyMCon? We have an anonymous suggestion box waiting for you

Have you enjoyed PyMCon? Please fill out our PyMCon attendee survey. It is open to both async PyMCon attendees and people taking part in the live session.


Thanks for the talk, Dan! And thanks for all the work you have been doing to promote more modern inference methods in the astrophysics community.

A few thoughts:

  • I would love to hear some ideas from you on how the “onboarding” to PyMC3 could be better. Currently we rely mostly on pointing users to books like the PyMC3 port of Statistical Rethinking, and internally we have discussed restructuring the documentation so that it flows better.
  • I am interested to see some custom Theano ops in exoplanet, and to see how they integrate with theano-pymc with the JAX linker.
  • Somewhat unrelated but of personal interest: I know you have done quite a lot of work in exoplanet on custom NUTS tuning (schemes that are more similar to Stan’s). Do you have some examples that are really difficult to fit without a windowed tuning scheme?

Really cool talk Dan. I had no idea PyMC3 was so popular in astrophysics.

I think your areas of improvement are spot on. Regarding the second point, do you want to write that into a github issue?

Also, did our announcement here (https://pymc-devs.medium.com/the-future-of-pymc3-or-theano-is-dead-long-live-theano-d8005f8a0e9b) clear up the confusion around your point #3?


Very nice talk, @dfm. So many good points that I feel I would enjoy rewatching it!

Do you have some problems that could benefit from an Approximate Bayesian Computation approach? I would like to test our implementation with some real problems to both better understand the method/implementation and also try to find ways to improve it.

Thanks for the kind words and questions @junpenglao, @twiecki, and @aloctavodia!

Re: on-boarding: I don’t know the answers, but it is something I’d love to brainstorm about more. Perhaps elsewhere on this board or as a github issue?

Re: theano-jax: I’m very excited about this development and I think it will make future-proofing my software a lot easier. I’ve made some progress making custom JAX ops (e.g. https://github.com/exoplanet-dev/celerite2/pull/9) and I’m hoping to write a blog post soon that tests out some of the newest features with some astro-specific ops. It wasn’t trivial to get up and running with the steps needed to port the two (in particular, I’ve only previously had to care about the “transpose” rules, rather than traceable “jvp” ops), but I think I’ve learned enough to make it all work!
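For readers less familiar with the distinction mentioned above, here is a minimal pure-Python sketch. It assumes that the “transpose” rule refers to the reverse-mode vector-Jacobian product (the quantity a Theano op's gradient method supplies), while a “jvp” rule is the forward-mode Jacobian-vector product that JAX asks for. The toy function `f` and the helper names `jvp`, `vjp`, and `dot` are hypothetical and are not taken from exoplanet, celerite2, Theano, or JAX.

```python
import math


def f(x):
    """Toy op: map polar coordinates (r, theta) to Cartesian (x, y)."""
    r, theta = x
    return (r * math.cos(theta), r * math.sin(theta))


def jvp(x, v):
    """Forward mode: push an input tangent v through the Jacobian (J @ v)."""
    r, theta = x
    dr, dtheta = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * dr - r * s * dtheta, s * dr + r * c * dtheta)


def vjp(x, u):
    """Reverse mode: pull an output cotangent u back through J (J.T @ u)."""
    r, theta = x
    ux, uy = u
    c, s = math.cos(theta), math.sin(theta)
    return (c * ux + s * uy, -r * s * ux + r * c * uy)


def dot(a, b):
    """Plain inner product of two equal-length tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))
```

The two rules describe the same Jacobian from opposite sides, so for any point `x`, tangent `v`, and cotangent `u`, `dot(u, jvp(x, v))` should equal `dot(vjp(x, u), v)` up to rounding. That identity is a handy consistency check when porting an op from one style to the other.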

Re: custom tuning: I don’t have examples where adapt_full doesn’t work and my custom routines do, but in most of my astro test cases, the implementations in my PyMC3 Extras package (based loosely on the Stan methods) tend to be slightly to significantly faster during warmup. I definitely haven’t benchmarked this properly!
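As background on what a windowed tuning scheme looks like, the following is a rough sketch of the Stan-style warmup schedule: a fast initial buffer, a series of slow windows that double in length and after each of which the mass matrix is re-estimated, and a fast final buffer. The function name `stan_style_windows` is hypothetical, the default values follow Stan's documented defaults, and this is not the code used in exoplanet or PyMC3. It also simplifies Stan's behaviour by raising an error for short warmups instead of rescaling the buffers.

```python
def stan_style_windows(n_warmup=1000, init_buffer=75, term_buffer=50, base_window=25):
    """Return (start, end) step indices of the slow adaptation windows.

    Hypothetical helper: the mass matrix would be re-estimated at each `end`.
    """
    if init_buffer + base_window + term_buffer > n_warmup:
        # Simplification: Stan rescales the buffers here instead of failing.
        raise ValueError("warmup too short for the requested buffers")

    slow_end = n_warmup - term_buffer  # slow windows must finish here
    windows = []
    start, size = init_buffer, base_window
    while start < slow_end:
        end = start + size
        # If the next (doubled) window would not fit, stretch this one
        # so that it runs up to the start of the final buffer.
        if end + 2 * size > slow_end:
            end = slow_end
        windows.append((start, end))
        start = end
        size *= 2
    return windows
```

With the defaults, `stan_style_windows()` returns `[(75, 100), (100, 150), (150, 250), (250, 450), (450, 950)]`: early windows are short so that a poor initial mass matrix is replaced quickly, and later windows are long so that the final estimate uses many samples.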

Re: ABC: This is a great question and I think that a huge fraction of astronomy is well suited to ABC! I was hoping to focus more on looking into that in my research, but I’ve been distracted recently :smiley: I’d love to chat more!!


Thank you very much for the very nice presentation. I am very happy your extensions will be available as pymc keeps evolving.

I would like to emphasize what @dfm mentioned during his presentation: in many scientific projects, the theoretical model must handle external data grids before it can be evaluated against the observations (see exoplanet interpolator). The “PyMC3 and Theano” documentation starts by saying that “for the most part you don’t need detailed knowledge [of theano]”, but in these cases you do, and it is hard without strong expertise in both programming and statistics. This is true even in popular libraries with many examples.

I hope that for the next PyMCon we can all be together, discussing face to face :slight_smile:
