Hi! I'm Sanaica, and I've recently been following the Streaming/Online Inference project idea for GSoC 2026.
Over the last few days, I've been benchmarking how PyMC handles continuously arriving data (e.g., tick data from the yfinance API). My goal was to bypass the graph-recompilation bottleneck that occurs when feeding in real-time data.
By pairing mutable pm.Data containers with pm.ADVI(), I built a continuous while loop that ingests real-time inputs (tick-by-tick prices plus exogenous sentiment scores), updates the underlying variational parameters, and samples the posterior predictive without stopping or recompiling.
As the benchmark graph shows, the streaming architecture drops latency from ~12.0 seconds per update (traditional MCMC refitting) to ~0.1 seconds per update.
Here is a snapshot of the live dashboard generated by the streaming architecture, showing the continuous API inputs automatically updating the Bayesian probability bands (expected return μ and uncertainty ±1σ) within milliseconds:
- Top panel (Price & Signals): a black line showing the Bitcoin price, overlaid with colored dots (green for buy, red for sell).
- Middle panel (Whale Sentiment): a red/green bar chart showing simulated billionaire 13F data (scaled from -1 to 1).
- Bottom panel (The PyMC Math): an orange line showing the expected return.
As I draft a formal GSoC proposal, I want to make sure my focus aligns with the core team's roadmap. Would you prefer a proposal focused on (a) a formal "Streaming Adapter/Wrapper" that standardizes how PyMC ingests live Python generators or Dask streams, or (b) the math backend, i.e., recursive Bayesian updating (using the current posterior as the new prior)?
Any guidance on which is higher priority for 2026 would be appreciated!
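To make the second option concrete, here's a toy sketch of recursive updating in the conjugate Normal-with-known-variance case (plain NumPy, illustrative names). In this conjugate setting, feeding batches sequentially with the posterior as the next prior recovers exactly the same posterior as processing all data at once:

```python
import numpy as np

def normal_update(mu0, tau0_sq, y, sigma_sq):
    """Conjugate update: prior N(mu0, tau0_sq), data y ~ N(mu, sigma_sq).

    Returns the posterior mean and variance for mu.
    """
    prec = 1.0 / tau0_sq + len(y) / sigma_sq
    mu_post = (mu0 / tau0_sq + y.sum() / sigma_sq) / prec
    return mu_post, 1.0 / prec

rng = np.random.default_rng(42)
sigma_sq = 1.0
data = rng.normal(2.0, np.sqrt(sigma_sq), size=300)

# Streaming: the posterior after each batch becomes the next prior.
mu, tau_sq = 0.0, 100.0  # weak initial prior
for batch in np.split(data, 10):  # ten batches of 30 "ticks"
    mu, tau_sq = normal_update(mu, tau_sq, batch, sigma_sq)

# Batch: all 300 observations at once, same initial prior.
mu_all, tau_all = normal_update(0.0, 100.0, data, sigma_sq)
print(mu, mu_all)  # identical up to floating-point error
```

Outside conjugate families the recursion needs an approximation step (e.g., fitting a parametric posterior and reusing it as the prior), which I'd expect to be the interesting part of the project.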

