Hi everyone,
This discussion is in reference to the PyMC Extras issue on local sensitivity checks for posterior predictions (here).
Following the suggestion in that thread, I’ve put together a small exploratory notebook to play around with a few ideas. The main question I was trying to get at is how sensitive posterior predictions are to small changes in the observed data, for example removing a single observation and measuring how much the predictions change. I also experimented with simple norm-based measures of that change and briefly looked into gradient-based approaches using autodiff.
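To make the leave-one-out idea concrete, here is a minimal sketch of the kind of check I mean. It is not taken from the notebook: it uses a conjugate Normal-Normal model with known observation noise so the posterior predictive mean has a closed form, and the "sensitivity" of each observation is just the absolute change in that mean when the observation is dropped (a simple norm-based measure). The function names and the toy setup are my own for illustration.

```python
import numpy as np

def posterior_predictive_mean(y, mu0=0.0, tau0=1.0, sigma=1.0):
    """Posterior mean of mu for a conjugate Normal-Normal model with
    known sigma; this equals the predictive mean for a new observation."""
    n = len(y)
    precision = 1.0 / tau0**2 + n / sigma**2
    return (mu0 / tau0**2 + y.sum() / sigma**2) / precision

def loo_sensitivity(y, **kwargs):
    """Absolute change in the posterior predictive mean when each
    observation is left out in turn."""
    full = posterior_predictive_mean(y, **kwargs)
    deltas = np.empty(len(y))
    for i in range(len(y)):
        y_minus_i = np.delete(y, i)
        deltas[i] = abs(posterior_predictive_mean(y_minus_i, **kwargs) - full)
    return deltas

rng = np.random.default_rng(0)
y = np.append(rng.normal(0.0, 1.0, size=50), 10.0)  # 50 draws plus an outlier

deltas = loo_sensitivity(y)
print(deltas.argmax())  # the outlier should be the most influential point
```

For a real PyMC model there is no closed form, so each leave-one-out refit would need a fresh `pm.sample` (expensive, which is exactly why PSIS-LOO-style importance-sampling approximations are attractive), but the comparison being made is the same.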
This is very much exploratory and not meant to be a concrete API proposal. I mainly wanted to get a feel for whether these kinds of checks are useful in practice and how they relate to existing approaches like PSIS-LOO, the Pareto k-hat diagnostic, or the sensitivity tools already in ArviZ.
Here is the notebook:
I’d really appreciate any thoughts or feedback, especially on whether this direction makes sense, whether there are better ways to frame it, or whether it overlaps too much with existing tools.
Thanks a lot!