Interest in GSoC 2026- Spatial Modeling project

mikjkd · February 6, 2026, 1:13pm

Hi everyone,

My name is Michele Di Giovanni (GitHub: mikjkd). I recently graduated with a PhD in computer science, and I’m interested in applying to the GSoC “Spatial modeling” project for PyMC.

As recommended for prospective GSoC students, I’ve already made a small contribution to PyMC: PR #8090 in pymc-devs/pymc, “Add Stan vs PyMC Rosetta Stone notebook” (refs #7771).

I’d like to discuss the project scope with the potential mentors, @bwengals and @fonnesbeck, to decide which algorithm would be the best fit and most valuable to implement this year.
I’m particularly interested in BYM/BYM2 models, but I’m flexible based on PyMC’s priorities and constraints.

Thanks a lot,
Michele Di Giovanni

fonnesbeck · February 6, 2026, 8:48pm

Hey Michele,

Thanks for getting in touch about GSoC. We are excited for another fun summer of PyMC development! It is still early days in the process and we are basically in a holding pattern until we know how many slots we will get in the program this year. Once we know, we will put out a call for students, at which point you would register with GSoC and submit a proposal, and our team would select the top proposals, up to the number of slots we get. We will post more details once we know more.

mikjkd · February 9, 2026, 10:42am

Hi Chris,
Thanks for the update — that makes perfect sense.

I reached out also because I understand it’s good practice to get in touch with organization members before submitting a proposal, mainly to align on what would be most useful for PyMC and what a strong proposal should contain.

In the meantime I’ll keep contributing to PyMC where I can, and I’ll start drafting a proposal around spatial modeling (likely ICAR/CAR and a BYM-style implementation), so I’m ready when the call opens.

If you have any pointers on what you’d like to see in proposals for this topic (scope, milestones, API/design preferences), I’d really appreciate it.

Thanks again,
Michele

daniel-saunders-phil · February 9, 2026, 7:44pm

Hi Michele, as you look into the BYM, CAR, ICAR work, you should check out the notebooks here to get a sense of current capacity.

There are a few features we are missing on ICAR:

- A random draw function: Issue #7713 in pymc-devs/pymc (“Add `rng_fn` to CAR/ICAR”).
- The ability to handle disconnected graphs. Mitzi Morris is really on top of this stuff: mitzimorris/geomed_2024 on GitHub (“Spatial Models in Stan”).
- The zero-sum parameterization, which is apparently much, much faster but is not how we currently do things: “The Sum-to-Zero Constraint in Stan”.

ricardoV94 · February 10, 2026, 5:04pm

That first issue has some pushback (from me). If you have insight that it is valid to sample from it, please weigh in.

mikjkd · February 10, 2026, 6:49pm

Hi @ricardov94 and @daniel-saunders-phil, from my reading of #7713 and the references linked there, I understand that the main reason behind Ricardo’s pushback is that “sampling from ICAR”, in its usual formulation, can be interpreted as sampling from an improper distribution, i.e. not a proper probability density on $\mathbb{R}^n$. This is mathematically ambiguous and potentially misleading to users. Providing an `rng_fn` in that setting risks giving users the impression that they are drawing from a well-defined prior on $\mathbb{R}^n$, when in fact the distribution is only defined up to an arbitrary gauge/constraint.

I realize that one can still generate draws from a singular Gaussian measure by working in the identifiable subspace (e.g. via an SVD/EIG decomposition and fixing the nullspace component), as described for instance here.
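A minimal NumPy sketch of that subspace construction, for illustration only. It assumes a symmetric adjacency matrix `W` and takes the ICAR precision to be the graph Laplacian `D - W` scaled by `1/sigma**2`; the function name `sample_icar_subspace`, the tolerance, and the 4-node path graph are hypothetical choices made here, not part of PyMC’s API or of issue #7713.

```python
import numpy as np

def sample_icar_subspace(W, n_draws, sigma=1.0, seed=None, tol=1e-10):
    """Draw from the Gaussian with precision (D - W) / sigma^2, restricted
    to the eigen-directions with non-zero eigenvalue (hypothetical helper)."""
    W = np.asarray(W, dtype=float)
    Q = np.diag(W.sum(axis=1)) - W           # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(Q)     # Q is symmetric, so eigh applies
    keep = eigvals > tol                     # drop the null space of Q
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_draws, int(keep.sum())))
    # Each kept direction has variance sigma^2 / eigenvalue.
    coords = sigma * z / np.sqrt(eigvals[keep])
    # Map subspace coordinates back to R^n; the null-space component is fixed at 0.
    return coords @ eigvecs[:, keep].T

# Illustrative 4-node path graph 0-1-2-3 (connected, so the null space is span{1}).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
draws = sample_icar_subspace(W, n_draws=1000, seed=0)
```

For a connected graph every draw sums to zero up to floating-point error, which is exactly the gauge choice that would need to be made explicit to users.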

However, at the moment I’m struggling to come up with concrete user-facing examples where this would be clearly appropriate and not misleading, unless we also make the gauge/constraint explicit (and document it very prominently). In other words, the math can be made operational, but I’m not convinced it maps cleanly to a typical “prior predictive” use case for ICAR as currently implemented.

Happy to be corrected if there are common workflows where users explicitly want “ICAR draws under a fixed gauge/constraint”.

Also, thanks @daniel-saunders-phil for putting together the notebooks and the feature list — they’re really helpful for understanding the intended direction. Are there any specific papers / references you have in mind that could be turned into a concrete implementation task (beyond rng_fn), or any feature from that list that would be a good next step to tackle?

daniel-saunders-phil · February 10, 2026, 7:38pm

Hi @mikjkd, thanks a lot! I appreciate your perspective on the sensibility of the draw function.

I don’t have a strong opinion on what to do about the draw function in ICAR. I’m thinking about this from a more high-level point of view. But I’ve noticed people stay away from ICAR because putting it in your model breaks the typical prior sampling and out-of-sample prediction workflows. They pick GPs instead because our forward sampling support there is alright. So if there is something to be done to rectify the situation, wonderful. If that’s just a fundamental limit of ICAR, that’s okay too; let’s just close that whole inquiry.

Question about the API for the fixed-constraint bit: is a zero-sum constraint the same thing as the kind of constraint you are talking about? One thing I find confusing in the sampling discussion around ICAR is that, when working through the algebra, the covariance matrix is clearly singular. But is it still singular after you apply the zero-sum constraint? I thought the zero-sum bit is precisely what allows you to take valid MCMC samples. And if you can take MCMC samples, then surely there must exist a forward sampling method that takes valid draws from the same distribution, even if we don’t know what the method is yet.
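One way to probe the singularity question numerically is sketched below, assuming a connected graph and a hard sum-to-zero constraint. The 4-node cycle graph and all variable names are illustrative assumptions; this does not reproduce how PyMC’s ICAR handles the constraint internally.

```python
import numpy as np

# Illustrative 4-node cycle graph 0-1-2-3-0.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = W.shape[0]
Q = np.diag(W.sum(axis=1)) - W               # graph Laplacian, singular by construction

rank_full = np.linalg.matrix_rank(Q)         # n - 1 for a connected graph

# Orthonormal basis of R^n whose first column is proportional to the ones vector;
# the remaining columns span the sum-to-zero subspace {x : sum(x) = 0}.
ones = np.ones((n, 1))
basis, _ = np.linalg.qr(np.hstack([ones, np.eye(n)]))
B = basis[:, 1:]                             # shape (n, n - 1)

# Precision expressed in sum-to-zero coordinates: positive definite.
Q_sub = B.T @ Q @ B
min_eig_sub = np.linalg.eigvalsh(Q_sub).min()

# Covariance mapped back to R^n: still rank n - 1, i.e. singular as an n x n matrix.
cov_full = B @ np.linalg.inv(Q_sub) @ B.T
rank_cov = np.linalg.matrix_rank(cov_full)
```

Under these assumptions the distribution is a proper Gaussian on the (n - 1)-dimensional sum-to-zero subspace, while its covariance viewed as an n x n matrix remains singular. Whether draws from that constrained Gaussian are what users mean by “sampling from ICAR” is the open question in this thread.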

ricardoV94 · February 11, 2026, 9:39am

My very high-level reading is that it’s akin to a flat prior. You can’t plausibly sample from the prior, but after conditioning on data, the posterior becomes proper and you can sample from it.

But the posterior is no longer flat. There’s obviously no rng for a flat prior.

It could be different, in that sampling from this parametrization is correct (and not just a trick to avoid numpy raising an error). But I haven’t seen the evidence this is the case.
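The flat-prior analogy can be written out as follows, assuming a connected graph and writing $Q = D - W$ for its Laplacian and $\phi$ for the ICAR vector (notation introduced here for illustration, not taken from the thread):

```latex
p(\phi) \propto \exp\!\left(-\tfrac{1}{2}\,\phi^\top Q\,\phi\right),
\qquad Q\,\mathbf{1} = 0
\;\Longrightarrow\;
p(\phi + c\,\mathbf{1}) = p(\phi) \quad \text{for all } c \in \mathbb{R}.
```

The unnormalized density is constant along the direction $\mathbf{1}$, so its integral over $\mathbb{R}^n$ diverges, just as a flat prior’s does over $\mathbb{R}$.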

mikjkd · February 26, 2026, 9:10pm

Hi @fonnesbeck ,

Thanks again for your reply.

I saw that the GSoC slots have now been allocated. Would it be possible to go a bit more in depth with you and the potential mentors on what would be most valuable to implement this year for the Spatial modeling project? I’m happy to adapt to PyMC’s priorities and constraints.

Best regards,
Michele Di Giovanni

fonnesbeck · February 27, 2026, 7:02pm

@mikjkd the contributor application period opens March 16, so it’s probably a good time to start pulling together a proposal.

Adebule_Emmanuel · March 28, 2026, 8:27am

Hi everyone, good day

My name is Adebule Emmanuel. I’m a 200-level Statistics student at the University of Jos, and I’m also currently taking the Google Data Analytics program. My background is mainly in probability, statistical inference, and statistical computing, and I have a basic working knowledge of Python.

I’ve been exploring PyMC for a while now, and I really like how it connects statistical theory to real-world problems. That’s something I’m genuinely interested in, especially in areas like spatial modelling and data-driven analysis.

I’m still learning, but I’m consistent, curious, and willing to put in the work. I’d really love the opportunity to contribute, grow, and learn from the community here.

I’d also appreciate the chance to connect with @bwengals and @fonnesbeck to get guidance on which ideas or algorithms would be most useful to focus on for this year’s GSoC.

I’m particularly interested in Gaussian Processes, but I’m very open to suggestions and happy to work on whatever aligns best with PyMC’s goals.

Thanks for your time, I’m looking forward to being part of the community.

Adebule Emmanuel
