GSoC 2026: Streaming Inference & ADVI with API Data

Hi! I am Sanaica, and I’ve recently been following the Streaming/Online Inference project for GSoC 2026.

Over the last few days, I’ve been benchmarking how PyMC handles continuously arriving data (for example, from the yfinance API). My goal was to bypass the graph re-compilation bottleneck that occurs when feeding in real-time data.

Using mutable pm.Data containers alongside pm.ADVI(), I built a continuous while loop that ingests real-time inputs (tick-by-tick prices + exogenous sentiment scores). On each iteration it swaps the new data into the containers, updates the underlying variational parameters, and samples the posterior predictive, all without stopping or recompiling the model graph.

As the benchmark graph shows, the streaming architecture cuts latency from ~12.0 seconds per update (traditional MCMC) to ~0.1 seconds per update.

Here is a snapshot of the live dashboard generated from the streaming architecture, showing the continuous API inputs automatically updating the Bayesian probability bands (Expected Return μ and Uncertainty ±1σ) in milliseconds:

  1. Top Panel (Price & Signals): A black line showing the Bitcoin price, overlaid with colored dots (Green for Buy, Red for Sell).
  2. Middle Panel (Whale Sentiment): A red/green bar chart showing the simulated Billionaire 13F data (-1 to 1).
  3. Bottom Panel (The PyMC Math): An orange line showing the Expected Return.

As I draft a formal GSoC proposal, I want to make sure my focus aligns with the core team’s roadmap. Would you prefer a proposal that focuses on a formal “Streaming Adapter/Wrapper” that standardizes how PyMC ingests live Python generators or Dask streams? Or one that focuses strictly on the math backend, such as recursive Bayesian updates (using the posterior as the new prior)?
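To make the two options concrete, here is a small pure-Python sketch, not a proposed PyMC API. `StreamingAdapter`, `NormalBelief`, and `conjugate_update` are hypothetical names; the adapter batches any iterable (such as a generator) and hands each batch to a pluggable update function, and the update shown is a textbook Normal-Normal conjugate step with known noise variance, where each posterior becomes the next prior:

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class NormalBelief:
    """Belief about an unknown mean, assuming known observation noise."""
    mean: float
    var: float


def conjugate_update(prior: NormalBelief, batch: List[float], noise_var: float) -> NormalBelief:
    """Normal-Normal update; the returned posterior serves as the next prior."""
    post_precision = 1.0 / prior.var + len(batch) / noise_var
    post_mean = (prior.mean / prior.var + sum(batch) / noise_var) / post_precision
    return NormalBelief(post_mean, 1.0 / post_precision)


class StreamingAdapter:
    """Hypothetical wrapper: batches an iterable and applies an update per batch."""

    def __init__(
        self,
        source: Iterable[float],
        batch_size: int,
        update: Callable[[NormalBelief, List[float]], NormalBelief],
        state: NormalBelief,
    ) -> None:
        self.source = source
        self.batch_size = batch_size
        self.update = update
        self.state = state

    def __iter__(self) -> Iterator[NormalBelief]:
        batch: List[float] = []
        for item in self.source:
            batch.append(item)
            if len(batch) == self.batch_size:
                self.state = self.update(self.state, batch)
                yield self.state
                batch = []
        # Leftover items that do not fill a batch are dropped in this sketch.


def ticks() -> Iterator[float]:
    """Stand-in for a live generator of observations."""
    yield from [1.0, 3.0, 2.0, 2.0, 1.5, 2.5]


adapter = StreamingAdapter(
    source=ticks(),
    batch_size=2,
    update=lambda prior, batch: conjugate_update(prior, batch, noise_var=1.0),
    state=NormalBelief(mean=0.0, var=100.0),
)
beliefs = list(adapter)  # one belief per completed batch

# For this conjugate model, updating batch by batch gives the same posterior
# as a single update on all six observations at once.
all_at_once = conjugate_update(NormalBelief(0.0, 100.0), list(ticks()), noise_var=1.0)
```

The adapter covers the first option (standardized ingestion) and the update function covers the second (the math backend); keeping them separate means the same adapter could, in principle, drive a different update rule.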

Any guidance on which is higher priority for 2026 would be appreciated!

Hello Sanaica!

I would prefer the former! Definitely share any notebooks with what you’ve been exploring!

Hi Zaxtax,

Thanks again for the reply and for the clear direction toward the Streaming Adapter/Wrapper. That’s exactly what I’ll focus on.
Here’s the notebook: https://drive.google.com/file/d/1LrNZ4bH3kGCVW34sqXk1kSIhKAJcAftO/view?usp=sharing

It looks like you had a lot of fun. Could you show how you might abstract things, so we can have a before and after? Basically, show how this use case will become easier after you do the work for this GSoC.