GSoC 2026: Interest in Survival Models

Hi everyone!

My name is Niccolò, I am a PhD student in AI at Sapienza University and a researcher at the Bank of Italy. I am writing to express my interest in the Survival Models project for GSoC 2026.

I’m currently building Bayesian Early Warning Systems (EWS) to predict liquidity runs. For this, I’ve been working with discrete-time hazard models and implementing custom MCMC engines in pure NumPy.

To get familiar with the PyTensor backend and with how censoring is currently handled in the codebase, I am looking into Issue #7581 (Mention vector bounds in Censored docstrings). I plan to post a message on the repo very soon, as I familiarize myself with the codebase.

While the Wiki mentions standard parametric (Exponential, Weibull, Log-Normal) and Cox models, is there interest in expanding the scope of this new module to natively support discrete-time recurrent-event survival models (for example, with risk-set masking over time series)?
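To make "risk-set masking" concrete, here is a minimal NumPy sketch of a discrete-time hazard log-likelihood. It is only an illustration under stated assumptions, not an existing PyMC API: the function name `discrete_hazard_loglik`, the constant hazard of 0.2, and the toy data are all hypothetical.

```python
import numpy as np

def discrete_hazard_loglik(hazard, events, at_risk):
    """Bernoulli log-likelihood over a (subjects x periods) grid.

    hazard  : P(event in period t | at risk in period t), values in (0, 1)
    events  : 1 if an event occurred in that period, else 0
    at_risk : 1 if the subject is in the risk set in that period, else 0
    """
    hazard = np.asarray(hazard, dtype=float)
    events = np.asarray(events, dtype=float)
    at_risk = np.asarray(at_risk, dtype=float)
    # Per-cell Bernoulli log-probability of the observed outcome.
    cell = events * np.log(hazard) + (1.0 - events) * np.log1p(-hazard)
    # Cells outside the risk set contribute exactly zero.
    return float(np.sum(at_risk * cell))

# Two hypothetical entities observed over four periods.
hazard = np.full((2, 4), 0.2)
events = np.array([[0, 1, 0, 1],    # recurrent: events in periods 2 and 4
                   [0, 0, 0, 0]])   # no event while observed
at_risk = np.array([[1, 1, 1, 1],   # stays at risk after the first event
                    [1, 1, 0, 0]])  # leaves observation after period 2

ll = discrete_hazard_loglik(hazard, events, at_risk)
```

Because the entity stays in the risk set after an event, the same mask covers recurrent events. In a single-event model the mask would instead be zero after the first event.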

Are there specific architectural design docs or PyTensor modules you recommend I study before drafting my formal proposal?

Looking forward to hearing from you.
Best,
Niccolò

Quick update: I commented on pymc-devs/pymc#7581. I’ve seen there’s already an in-progress PR with a couple of review notes. If you agree, I can implement the changes and open a small PR to improve the pm.Censored docstring (vector bounds + inf example).
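For readers unfamiliar with the issue, here is a minimal NumPy sketch of what per-observation (vector) bounds with an `inf` entry mean for a right-censored likelihood. It does not call PyMC, where the corresponding bounds are the `lower` and `upper` arguments of `pm.Censored`. The exponential model, the function name `right_censored_exp_loglik`, and the data are assumptions chosen for illustration.

```python
import numpy as np

def right_censored_exp_loglik(y, upper, rate):
    """Log-likelihood of exponential times with per-observation upper bounds.

    An observation recorded at its bound is treated as right-censored.
    A bound of np.inf means that observation can never be censored.
    """
    y = np.asarray(y, dtype=float)
    upper = np.asarray(upper, dtype=float)
    censored = y >= upper                  # never True where upper is inf
    log_pdf = np.log(rate) - rate * y      # log f(y) for observed times
    log_sf = -rate * y                     # log P(T > y), used at the bound
    return float(np.sum(np.where(censored, log_sf, log_pdf)))

y = np.array([1.0, 3.0, 2.0])
upper = np.array([3.0, 3.0, np.inf])       # third observation is uncensored
ll = right_censored_exp_loglik(y, upper, rate=0.5)
```

Only the second observation sits at its bound, so it contributes a survival term, while the other two contribute density terms.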

In parallel, I’m still very interested in the Survival Models GSoC 2026 project. I’d love to discuss it further before drafting the proposal.

Best,
Niccolò

Hi Niccolo

The financial application sounds interesting. Censoring is handled in PyMC, as you discovered in the linked issue. PyTensor is more in the background, and should “feel” like NumPy. There are some pure PyTensor example notebooks here, and some intro videos here (video 1 in a series) and here if you want to understand it more deeply.

For survival models specifically, it would be best to start by going over the survival model examples in pymc-examples. Try to figure out what tools exist and what is missing for the specific models you have in mind. Then your proposal can be a focused “I want to implement specific feature x, so that I can accomplish goal y, which is currently not possible.”

Hi Jesse,
I took your advice and spent some time reading the survival notebooks and trying to understand what is already there.
One thing I noticed is that the GSoC wiki explicitly mentions truncation but, unless I am mistaken, none of the notebooks covers it clearly. I am still learning, so I may be missing something, but it seems like it could be an interesting gap for the GSoC project.
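To make the distinction from censoring concrete, here is a minimal NumPy sketch of a left-truncated (delayed-entry) log-likelihood. Each subject is only observed conditional on having survived to its entry time, so its contribution is divided by the survival probability at entry. The exponential model, the function name `left_truncated_exp_loglik`, and the data are assumptions for illustration.

```python
import numpy as np

def left_truncated_exp_loglik(time, event, entry, rate):
    """Exponential log-likelihood with delayed entry and right-censoring.

    time  : observed event or censoring time
    event : 1 if the event was observed, 0 if right-censored
    entry : time at which the subject entered observation (0 = no truncation)
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    entry = np.asarray(entry, dtype=float)
    # log f(t) when the event is observed, log S(t) when censored.
    log_contrib = event * np.log(rate) - rate * time
    # Condition on survival up to entry: subtract log S(entry).
    log_sf_entry = -rate * entry
    return float(np.sum(log_contrib - log_sf_entry))

ll = left_truncated_exp_loglik(
    time=[5.0, 4.0], event=[1, 0], entry=[2.0, 0.0], rate=0.5
)
```

With `entry` set to zero everywhere, this reduces to the usual right-censored likelihood.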
More generally, the notebooks already seem to cover several formulations (parametric and semiparametric, working either directly with time-to-event or through hazard formulations). Because of that, and as the wiki requests, there may be room for an interface that avoids writing ad hoc code for each case.
My next step is to dig deeper and get my hands dirty with more examples to understand them better, and to check CausalPy to see how the formula approach works and, in general, become more comfortable with the codebase. But please let me know if you and the other mentors have other directions in mind.
As for the financial application, one direction I had been thinking about is recurrent events (since I’m working with financial shocks, I thought this could be interesting). I also noticed that the current examples mostly use baseline or static covariates, and I think adding the possibility to model time-dependent covariates could be useful for bringing in more information about how risk evolves over time. I still need to study that part more carefully, so I mention it only as a possible direction, and I am of course open to discussing it further.
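One common way to represent time-dependent covariates in discrete time is a "person-period" (long) layout, with one row per subject per observed period. The sketch below shows that data structure only. The helper name `to_person_period`, its arguments, and the toy values are hypothetical and do not come from any existing notebook.

```python
import numpy as np

def to_person_period(covariates, event_periods, n_observed):
    """Expand per-subject series into rows of (subject, period, x, event).

    covariates    : (n_subjects, n_periods) array of time-varying values
    event_periods : one set of 0-based period indices with an event per subject
    n_observed    : number of periods each subject was under observation
    """
    rows = []
    for i, n_obs in enumerate(n_observed):
        for t in range(n_obs):
            # Recurrent events simply appear as several rows with event = 1.
            rows.append((i, t, covariates[i, t], int(t in event_periods[i])))
    return np.array(rows, dtype=float)

covariates = np.array([[0.1, 0.4, 0.9],
                       [0.2, 0.2, 0.3]])
long = to_person_period(
    covariates,
    event_periods=[{1, 2}, set()],   # subject 0 has two events, subject 1 none
    n_observed=[3, 2],               # subject 1 is censored after two periods
)
```

Each row can then feed a Bernoulli (for example, logistic) hazard model, with the covariate column taking whatever value held in that period.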

Regarding the proposal, should I just communicate these ideas, or go deeper into how I imagine the interface being used?
Thank you,
Niccolò