That’s really cool. How does it work? Are you able to do this tightly or do you marginalize over the entire density? I was considering doing the latter in Stan since we can’t pull out a Markov blanket, but I thought it’d be too slow. I know Matt Hoffman and Maria Gorinova worked on this problem, too, where they looked for cases in the graphical model where they could unfold analytically.
I have a tutorial in the Stan User’s Guide on how to manually marginalize a change-point model like this, although I only cover the single change point case and I don’t think the Stan code can be easily ported to PyMC. The issue with marginalizing is that if there are 300 time points, marginalizing out two cut points costs roughly 300^2 / 2 (about 45K) entire density evaluations, one for each ordered pair of cut point locations. At some point, you’ll probably have to resort to discrete sampling with constraints so the change points are ordered properly.
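To make the cost concrete, here’s a small sketch (my own illustration, not code from the User’s Guide, using a Gaussian likelihood with known segment means rather than a real PyMC/Stan model) that marginalizes two ordered change points by enumerating every pair s1 < s2. The enumeration is the ~T^2/2 cost described above; precomputing cumulative per-segment log-likelihoods makes each pair O(1) rather than a full density evaluation.

```python
import numpy as np

def two_changepoint_log_marginal(y, mus, sigma=1.0):
    """Marginalize two ordered change points (s1 < s2) out of a
    three-segment Gaussian model with known segment means `mus`,
    under a uniform prior over the ordered pairs.

    Returns (log_marginal, pairs, pair_logliks), where pairs[i]
    is the (s1, s2) that produced pair_logliks[i].
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    # Per-observation log-likelihood under each of the three means: (3, T).
    lp = (-0.5 * ((y[None, :] - np.asarray(mus)[:, None]) / sigma) ** 2
          - np.log(sigma) - 0.5 * np.log(2 * np.pi))
    # cum[k, t] = sum of lp[k, :t]; segment sums become differences.
    cum = np.hstack([np.zeros((3, 1)), np.cumsum(lp, axis=1)])
    pairs, logliks = [], []
    for s1 in range(1, T - 1):          # first segment: y[:s1]
        for s2 in range(s1 + 1, T):     # second: y[s1:s2], third: y[s2:]
            ll = (cum[0, s1]
                  + (cum[1, s2] - cum[1, s1])
                  + (cum[2, T] - cum[2, s2]))
            pairs.append((s1, s2))
            logliks.append(ll)
    logliks = np.array(logliks)
    # Log-sum-exp over all pairs, with the uniform prior 1/len(pairs).
    m = logliks.max()
    log_marginal = m + np.log(np.exp(logliks - m).sum()) - np.log(len(pairs))
    return log_marginal, pairs, logliks
```

With T = 300 the inner loop runs about 45K times, which is cheap here because each iteration is three array lookups, but in a real PyMC or Stan model each of those would be a full (conditional) density evaluation, which is where the cost bites.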