Complaint Monday - What has been bothering you about PyMC?

ricardoV94 · June 12, 2023, 8:19am

Feel free to write any issues / bugs / missing features that have been bothering you.

We would love to hear from our users.

jaharvey8 · June 14, 2023, 12:16am

Saving and then loading a trace and model, so that you can make new out-of-sample predictions with a previously fitted model, still feels awkward and clunky. And given how common this practice is, it seems like there should be an example notebook, rather than new users having to search the Discourse for how to do it.

ricardoV94 · June 14, 2023, 5:09am

What do you think of "Using ModelBuilder class for deploying PyMC models" in the PyMC example gallery?

It is being developed to try and answer the problem you mention.

jaharvey8 · June 14, 2023, 10:49am

Oh wow, I knew this was being developed but I was unaware that it had been added to pymc-experimental. This looks great, I’ll try it out.

Regarding my “complaint”, and getting philosophical for a second: I think PyMC exists in a realm where you build these super amazing, highly descriptive models and infer and learn a lot about your data. But then it kind of stops there, because predicting on new data is time-consuming. Machine learning is more the opposite: you build models without really knowing why they work and without learning much about the data, but predicting on new data is really easy. So I think this functionality is going to be really powerful and will help bridge that gap between “learn about your data” and “deploy on new data”.

juststarted · June 15, 2023, 7:20am

As a preliminary remark, I love the PyMC framework, I have been using it for years and I think it’s an amazing project. That being said, I usually fit models on a server, and since I often work with pretty heavy likelihood functions & a lot of data the fitting time can easily exceed the maximum job length. This is a problem because there still is no way of interrupting sampling, saving the trace, and restarting from where it left off. McBackend is meant to do this but something is broken (see this issue). I imagine this is a fairly common problem for other users as well, so any improvement on this front would be awesome!

ricardoV94 · June 15, 2023, 8:05pm

I think that libraries like blackjax make resuming sampling much easier because the samplers are written in a functional paradigm. @junpenglao may be able to confirm or deny.

I wonder if nutpie makes this easier as well? CC @aseyboldt

Our native samplers should also be able to do this but would require refactoring. Simple samplers like Metropolis and Slice should be easy (not too many tuning variables to keep track of). NUTS may be a tough beast though.
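
To make the "functional paradigm" point concrete, here is a minimal sketch in plain Python. It is not PyMC or blackjax code, and every name in it (`ChainState`, `run_chain`, `save_state`, `load_state`) is made up for illustration. The idea is that a random-walk Metropolis chain keeps all of its state (position, log-density, step size, RNG state) in one explicit tuple, so a checkpoint is just that tuple written to disk, and a resumed run should reproduce the uninterrupted chain.

```python
import math
import os
import pickle
import random
import tempfile
from typing import Callable, List, NamedTuple, Tuple


class ChainState(NamedTuple):
    """Hypothetical container: everything needed to continue the chain."""
    position: float
    log_density: float
    step_size: float
    rng_state: tuple  # as returned by random.Random.getstate()


def init_state(log_density_fn: Callable[[float], float], position: float,
               step_size: float, seed: int) -> ChainState:
    rng = random.Random(seed)
    return ChainState(position, log_density_fn(position), step_size, rng.getstate())


def run_chain(log_density_fn: Callable[[float], float], state: ChainState,
              n_steps: int) -> Tuple[List[float], ChainState]:
    """Run n_steps of random-walk Metropolis; return the draws and the new state."""
    rng = random.Random()
    rng.setstate(state.rng_state)  # continue the random stream where it stopped
    position, log_density = state.position, state.log_density
    draws = []
    for _ in range(n_steps):
        proposal = position + rng.gauss(0.0, state.step_size)
        proposal_log_density = log_density_fn(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < proposal_log_density - log_density:
            position, log_density = proposal, proposal_log_density
        draws.append(position)
    return draws, ChainState(position, log_density, state.step_size, rng.getstate())


def save_state(state: ChainState, path: str) -> None:
    with open(path, "wb") as f:
        pickle.dump(tuple(state), f)  # plain tuple of built-in types


def load_state(path: str) -> ChainState:
    with open(path, "rb") as f:
        return ChainState(*pickle.load(f))


def standard_normal_logp(x: float) -> float:
    return -0.5 * x * x  # unnormalised log-density of N(0, 1)


if __name__ == "__main__":
    start = init_state(standard_normal_logp, position=0.0, step_size=1.0, seed=42)
    first_half, checkpoint = run_chain(standard_normal_logp, start, 500)

    path = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
    save_state(checkpoint, path)  # e.g. just before the job time limit

    # Later, possibly in a new process: pick up exactly where we left off.
    second_half, _ = run_chain(standard_normal_logp, load_state(path), 500)
    print(len(first_half) + len(second_half), "draws in total")
```

NUTS is harder mainly because the state would also have to carry the adaptation variables (step size schedule, mass matrix estimate and so on), not just the position and the RNG.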

junpenglao · June 16, 2023, 4:24am

Yep, you can use blackjax for that, as it is written to be more modular. I think it won't even take that much refactoring for our native samplers to do so.

juststarted · June 19, 2023, 7:30am

Thanks for the help!

Unfortunately, my model does not compile with any of those (nutpie, blackjax, numpyro), and I haven’t found a way to debug the problem. It likely has to do with some complicated shape issue or the scan operation. It runs fine with pm.sample, and parameter recovery even worked well with a (much) smaller dataset than the one I have (though variational inference did not: the posterior for the last component of a Dirichlet was systematically underestimated).
