Potential GSoC 2026 Project Idea

Hey everyone! I introduced myself earlier, but again: I'm Adishree Bharadwaj, a CS student from India looking to contribute to PyMC for GSoC 2026.

I’ve been researching PyMC’s applications in finance and social systems and noticed a gap around changepoint detection. It also seems to have been on the PyMC radar for a few years, so I wanted to propose a specific angle that hasn’t been tackled yet. The core problems as I see them:

  • users have to fix the number of changepoints before sampling, rather than learning it from data
  • changepoint locations as discrete variables make MCMC sampling slow and unstable
  • automatic marginalization helps but breaks down as the number of changepoints grows
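
To make the marginalization point concrete, here is a minimal NumPy sketch (not PyMC code, and not how pymc-extras does it) of summing out a single discrete changepoint location. It assumes a Gaussian mean-shift model where the two segment means and the noise scale are already known; in a real model those would be sampled and this sum would sit inside the log-likelihood. The function name and arguments are made up for illustration.

```python
import numpy as np


def changepoint_log_posterior(y, mu_before, mu_after, sigma):
    """Log posterior over one changepoint location under a uniform prior.

    tau = k means y[:k] belongs to the first regime and y[k:] to the second,
    for k = 1, ..., n - 1. All model parameters are assumed known here.
    """
    y = np.asarray(y, dtype=float)
    n = y.size

    # Pointwise Gaussian log-densities under each regime.
    const = -0.5 * np.log(2.0 * np.pi * sigma**2)
    ll_before = const - 0.5 * ((y - mu_before) / sigma) ** 2
    ll_after = const - 0.5 * ((y - mu_after) / sigma) ** 2

    # Prefix and suffix sums give every segment log-likelihood in O(n).
    cum_before = np.cumsum(ll_before)            # cum_before[k-1] = sum(ll_before[:k])
    cum_after = np.cumsum(ll_after[::-1])[::-1]  # cum_after[k]    = sum(ll_after[k:])

    taus = np.arange(1, n)
    log_joint = cum_before[taus - 1] + cum_after[taus] - np.log(n - 1)

    # Log-sum-exp over tau: this is the marginalization step.
    m = log_joint.max()
    log_marginal = m + np.log(np.exp(log_joint - m).sum())
    return taus, log_joint - log_marginal, log_marginal


# Synthetic data with a true changepoint at index 60.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(3.0, 1.0, 40)])

taus, log_post, log_marginal = changepoint_log_posterior(y, 0.0, 3.0, 1.0)
tau_map = taus[np.argmax(log_post)]
print("most probable changepoint:", tau_map)
```

With one changepoint the sum has only n - 1 terms, so the discrete variable disappears from sampling at little cost. The trouble starts when several changepoints have to be summed out jointly.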

My proposal is to build a module in pymc-extras for Bayesian changepoint detection. The user provides their data, the likelihood type, and optionally a rough idea of how many changepoints to expect; the module takes care of everything else. I’d also include two worked examples on real public datasets: one for financial regime detection and one for a social systems use case.
I’m aware that marginalization becomes intractable for very large numbers of changepoints, and I would document these limits honestly rather than pretend there’s a solution that doesn’t actually exist.
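
As a rough illustration of where that limit comes from, the sketch below counts how many changepoint configurations a brute-force sum would have to visit. It assumes k distinct changepoints placed among the n - 1 gaps between n observations; the helper name is hypothetical, and this only describes naive enumeration, not smarter recursions that exist for some model classes.

```python
from math import comb


def n_configurations(n_obs, n_changepoints):
    """Number of ways to place distinct changepoints in the n_obs - 1 gaps."""
    return comb(n_obs - 1, n_changepoints)


# Growth of the enumeration cost for a series of 1000 observations.
for k in (1, 2, 5, 10):
    print(f"{k:>2} changepoints: {n_configurations(1000, k):.3e} configurations")
```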

I'd love to know if this direction interests any potential mentors, and whether there’s anything I'm missing or misunderstanding about the current idea. Thanks!

Are you aware of specific approaches to solve this problem?

Hey, thanks for the question. After reflecting more, I think I’d like to redirect my focus to another project idea that aligns better with my current skills, which I will bring up in a new conversation really soon!