A lot of appreciation for the PyMC team’s work, along with a documentation suggestion

Hi all! I’m not sure if this is the right place to post this. I’m happy to move it or delete it if it’s not.

I wanted to express my gratitude for the fantastic library you’ve developed. I can see it’s taken a lot of work over many years and, while effort sadly does not always lead to excellent results, happily it has here.

Funnily, what I’ve found most salient about my PyMC journey is that I’ve struggled a lot, to the point that I would most likely have given up with most other libraries. This may sound like a negative, but what’s relevant to me is that I’ve never felt like giving up with PyMC, because the process has been so fun despite the obstacles. I’d say that we all like the easy stuff, but we only like the hard things when they are meaningful in some sense, and it’s in this context that my struggles turn into a source of praise.

Now, I’m sure you all want the learning curve to be flatter (I certainly wouldn’t have minded a quicker adaptation), as is evident from the wide variety of great tutorials and examples you’ve made, plus the active (and very well disposed) participation on this forum (another thing to be grateful for). I think the didactic effort is particularly commendable given how few of you there seem to be: I tend to see the same names around here, which has its upsides (it brings a sense of familiarity that’s oddly comforting, in spite of the lack of actual interaction) and its downsides (a few people only have so much time).

Anyway, the thing is that at some level I find something missing, but when I look at all the factors that would improve it, I see they are all there. Maybe it’s just that PyMC has a high degree of irreducible complexity, which is understandable given how useful it is (i.e., it’s some sort of “no free lunch”). Still, I figured I could share my perspective in case it’s helpful.

I found two main resources that seem aimed at addressing what I have in mind. The first one is “How to debug a model” (PyMC example gallery), which has the title I’m looking for, but I’ve been unable to benefit from its content. The second one is “Distribution Dimensionality” (PyMC 5.19.0 documentation), which addresses the crucial matter of dimensioning variables correctly, but for some reason it’s a text I’ve only been able to fully appreciate in hindsight. I think its last section, “tips for debugging shape issues”, is great though (the advice to use prime numbers is brilliant), and it’s a good complement to Ricardo’s comments (not just the linked one) in this discussion: Debug mode in PyTensor - #6 by ricardoV94.

In summary, what I have in mind is a “How to debug” guide in the “Core Notebooks” section (“Notebooks on core features” in the PyMC 5.19.0 documentation) with Ricardo’s advice on debugging (i.e., ignoring pytensor, building models incrementally, checking shapes), along with the aforementioned tips and certain calls that have been particularly useful to me, which I list below. I understand this is based on my particular experience and I wouldn’t be surprised if it’s not relevant to most people. I just wanted to bring it up in case it’s worth considering.

model["variable_name"].shape.eval()  # I found it quite useful to know that .eval() is the key to getting around pytensor objects
pm.draw(model["variable_name"])
# Rule of thumb: if draw works and sample doesn't, review the parametrisation
# When there's an error, remove dims or size arguments and see what shapes come out
pm.logp(model["variable_name"], value).eval()  # particularly useful for custom distributions
# The following may be too specific:
model.point_logps()  # to check logp of custom dists has been derived
model.initial_point()  # to check when there are nans or -inf above / issues with initialisation
model["variable_name"].get_parents()[0].inputs[1]  # to inspect dependencies and check a copy is not being instantiated

Anyway, the big takeaway is a great deal of appreciation for your work. I mentioned this issue because I felt it was my chance to make a tiny contribution, but that’s just 1% of the story. Apart from PyMC, I also love some of the side projects that have grown around it and are forming a wonderful ecosystem: ArviZ, PreliZ and Bambi come to mind (I haven’t worked with the last one that much, but I find it a very interesting attempt at simplification). I’m also excited about new developments, e.g., the stuff I see at pymc experimental, especially around time series models: I’m looking forward to no longer using scan, although I’ve read somewhere that this is not a reasonable expectation. I’m also very keen on the HSGP idea, which I believe is a relatively recent addition.

It’s so pleasing to work in a field where someone has already taken care of developing a lot of the necessary machinery for the job. Thanks for the fantastic work, it really makes a difference!!!


Thanks for the warm feedback. Do not hesitate to contribute to the “how to debug” notebook, and we can certainly make that a core notebook. Makes total sense :wink:
