GSoC 2026: Interest in Scalable Online Bayesian State Space Models - Sarthak Singhal

Hi everyone,

I’m Sarthak Singhal, a third-year Computer Science student at Dronacharya Group of Institutions, Greater Noida. I’m interested in contributing to PyMC for GSoC 2026, specifically the “Scalable Online Bayesian State Space Models” project.

Background:

  • Proficient in Python with hands-on experience in NumPy and pandas
  • Coursework in statistics, probability, and linear algebra
  • Strong interest in quantitative finance and time series analysis
  • Comfortable with Git/GitHub and collaborative development

My situation:
I’ll be honest: I’m relatively new to Bayesian inference and state space models, but I’m highly motivated to learn. I’m drawn to this project because:

  • I’m fascinated by how these models are applied in financial forecasting and algorithmic trading
  • State space models represent a crucial tool for quantitative researchers, which aligns with my career aspirations
  • The challenge of making these models scalable and online appeals to my interest in both systems optimization and statistical modeling

My learning plan:

  • Working through PyMC’s documentation and tutorials systematically
  • Reading foundational materials on state space models and Kalman filters
  • Studying the current state space implementation in PyMC’s codebase
  • Making small contributions (documentation, tests, or bug fixes) to familiarize myself with the project structure
  • Actively engaging with the community to learn from experienced contributors

Questions for mentors:

  • What prerequisites should I prioritize learning before the application period?
  • Are there specific papers or resources you’d recommend for understanding state space models in a Bayesian context?
  • What would be good first issues for someone at my level to start contributing?
  • How much prior knowledge of MCMC sampling methods is typically expected?
  • Could you share insights on the current limitations of online state space models in PyMC that this project aims to address?

I’m committed to putting in the work needed to contribute meaningfully to this project. Any guidance on getting started would be greatly appreciated!

Looking forward to learning from this community.

Best,
Sarthak Singhal

Hi Sarthak,

Nice to hear you’re interested in the pymc/pytensor ecosystem. I am going through all these introduction posts and copy/pasting the same question and advice. Have you ever worked with Bayesian models before? Even if the answer is yes, seeing you take our tools and apply them to some scientific questions that you’re interested in would be the right place to start. Basically my view is that it will be impossible for you to help develop these tools if you aren’t already using them yourself. If your interests are more in the direction of deep learning, then the same advice applies but with pytensor itself.

Basically the answer to all your questions is that if you’re not actively playing with these tools and trying to learn how to use them, you’re not going to be able to contribute as a toolmaker. I would suggest that, rather than trying to find an issue to close, you spend some time working on a notebook that shows you’re exploring and learning how to use pymc/pytensor/pymc-extras. If you are interested in statespace, there are example notebooks in pymc-extras/notebooks where you can start.