[PyMCon Web Series] The Power of Bayes in Industry: Your Business Model is Your Data Generating Process (Feb 9, 2023)

Speaker: Dante Gates

Event type: Live webinar
Date: Feb 9th, 2023 (subscribe here for email updates)
Time: 21:00 UTC (4pm EST)
Register for the event: Meetup event or Zoom
Notebook: On Colab

NOTE: The event will be recorded. Subscribe to the PyMC YouTube channel for notifications.

Welcome to the first event of the PyMCon Web Series! As part of this series, most events will have an async component and a live talk.

For this event's async component, Dante prepared a Colab notebook for the community to work through before the talk. Run it and answer the questions Dante left for discussion:

  • What is your favorite example of a Data Generating Process (DGP) / first principles model?
  • Have you applied the ideas in this post in industry?
  • What are some of the benefits we missed?

Abstract of the talk

This talk will attempt to answer the question “what is a Data Generating Process and why does it matter?” While we will begin our discussion with a bit of theory, don’t worry about this being too technical or inaccessible if you’re new to Bayesian Statistics. Our primary goal is to focus on the second half of the question and give you tools to use for real-world applications.

With the core concepts and background covered, we’ll demonstrate how incorporating this understanding into our modeling decisions lets us embed elements of a business function directly into our statistical models. This can provide immense value in industry settings, especially where traditional machine learning techniques fail. The benefits include:

  • Tackling critical problems when data is lacking, such as launching a new product

  • Building powerful, predictive models that are difficult to overfit

  • Getting explainability built in, already expressed in the terms of your business

Best of all, the design techniques we propose here are such that when you get one of the benefits above, the rest usually come for free.

All of this and more will be illustrated through concrete examples drawn from both publicly available data and proprietary data we use here at Perpay.

Sounds great! How can I register for the event?

Thanks for the excellent Colab notebook; it definitely cast the putting example in a new light.

I hope this is the right place to share my answers to the 3 questions:

  • Reasoning about how demand for a particular product/commodity evolves over time as a function of physical events that lead to conversion
  • Reasoning about the behavior of a distributed computing system (e.g., Kafka), which displays a mix of deterministic behavior (software behaves deterministically most of the time) and probabilistic behavior (under heavy loads or spikes, a distributed compute environment behaves stochastically, even though the software itself is deterministic at the individual node level)
  • Several times, in principle, and in several domains, but without formalizing a DGP in code. I've been looking for the right software toolkit to do this in a repeatable and reproducible manner, and have been getting tripped up by a lack of understanding of how to model complicated multi-time-step PGMs. I also find it difficult to express hybrid DGPs, which mix deterministic and probabilistic links, using existing tooling.
  • Using a DGP to run realistic simulation on a business model can inform experimental design for estimating specific sensitivities of business metrics to policy variables
  • This can also plug into programmatic recommendations for business strategy pivots
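
To make the points above about hybrid DGPs and simulation a bit more concrete, here is a minimal sketch in plain Python. It is not taken from the talk or the notebook: the pricing scenario, the function names (`simulate_month`, `expected_revenue`), the linear price-to-conversion link, and all parameter values are hypothetical and chosen only for illustration. It mixes deterministic links (conversion rate as a function of price, revenue as price times conversions) with one probabilistic link (which visitors actually convert), then simulates forward to see how a business metric responds to a policy variable.

```python
import random
import statistics


def simulate_month(price, visitors=1000, base_rate=0.10,
                   price_sensitivity=0.004, rng=random):
    """Draw one month of revenue from a toy data generating process.

    All parameter values are hypothetical, for illustration only.
    """
    # Deterministic link: conversion rate falls linearly with price
    # (an assumed functional form), floored at zero.
    rate = max(0.0, base_rate - price_sensitivity * price)
    # Probabilistic link: each visitor converts independently with that rate.
    conversions = sum(rng.random() < rate for _ in range(visitors))
    # Deterministic link: revenue is price times the number of conversions.
    return price * conversions


def expected_revenue(price, n_sims=200, seed=0):
    """Average simulated revenue at a given price over n_sims draws."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return statistics.mean(
        simulate_month(price, rng=rng) for _ in range(n_sims)
    )


if __name__ == "__main__":
    # Sweep the policy variable (price) and compare simulated revenue.
    for price in (5, 10, 15, 20):
        print(price, round(expected_revenue(price), 1))
```

In a probabilistic programming library the same structure would typically be written as a model with priors on `base_rate` and `price_sensitivity`, so the parameters could be inferred from data rather than fixed by hand; the forward simulation here only shows the shape of the DGP.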

Hi Gireesh!

We will send the link to the mailing list the week before the event. I edited the main post to make this clearer! We are not sharing the link publicly to avoid Zoom-bombing. You can also subscribe to the Meetup event.
