PyMC3 versions and dependencies?

Is there documentation anywhere that lists or describes which versions of the various parts of the pymc3 jigsaw work together, i.e. versions of python, anaconda, pymc3, theano/aesara, arviz, etc.?

I have used the published docker images, with some hacking, to get things working, but I have also tried to create docker images that just install anaconda and then install pymc3. This just results in lots of errors, because the solver cannot resolve the dependency sets.

It would be much easier to do this - if there was a way of knowing what the actual version dependencies are - right now I am relying on trial and error - and it is more often error.

A

PyMC3's current requirements can be found here. Aesara's are here and arviz's are here. That being said, tracking down dependencies (and dependencies of dependencies) is generally a nightmare and you might consider asking about your particular scenario and see if anyone has any more general tips. I personally don't use docker, but I know that others around here do.


Yes, it is certainly not straightforward.

Just as an example, in the docs I have seen that the best (only reliable) way to install pymc3 is to use conda. They say pip can be used, but that it is not recommended or stable.

However, if I try something simple like:

FROM continuumio/anaconda3

RUN conda install -c conda-forge pymc3

then, after a while, it produces the error "failed with initial frozen solve", followed by a bunch of other errors when it tries again.

If I look at the Dockerfile that was used to create the pymc3 image on Dockerhub, it looks like everything required is just copied over from the original source directory rather than installed as part of a build process in docker, so it's difficult to know what is really going on.

A

A bit of an update... In docker, I could get this to work:

FROM continuumio/anaconda3:2019.10

RUN conda install -c conda-forge pymc3

i.e. the install seems to work with the 2019.10 version of the anaconda3 image.

but I get the 'frozen solve' message with versions: 2020.02, 2020.07, 2020.11, 2021.04, 2021.05 & latest.
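To record which versions the working image actually resolved to, something like the following could be run inside the container. This is only a sketch: the list of distribution names is my guess at the relevant ones, `importlib.metadata` needs Python 3.8 or newer (on older images `conda list` gives the same information), and a conda package that ships without standard metadata would show up as missing.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of interesting distributions; adjust to your own stack.
PACKAGES = ["pymc3", "theano", "theano-pymc", "aesara", "arviz", "numpy", "scipy"]

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Saving the output next to the Dockerfile gives a known-good combination to compare against when a newer base image fails to solve.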

A

Other than Python versions, the only combinations that are thoroughly tested are the latest releases (and in some cases development versions too). You can take a look at the CI scripts in pymc or arviz to see which version matrices they use.

If you are interested in combinations of old versions that are compatible and work together, the safest bet is to take a look at the requirement files linked above by @cluhmann as they were at the time of the release you are interested in, and take the minimum version that is set there. As an example with invented dependencies: if pymc3 3.7 has arviz>=0.7.0 and theano>=1.0.3, install exactly those versions. You may want to try increasing releases a bit from there, which will probably work, but I don't know of any better way to do that than trial and error, so it's not ideal. If having support for older versions and version guidance is something valuable and interesting to you or your company, you should consider subscribing to Tidelift, which will help you with version management and, with enough subscribers, could support and fund devs to work on older releases and version guidance.
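The "take the minimum version and install exactly that" step can be sketched in a few lines of Python. This is a rough illustration, not a full requirement parser: `pin_minimums` is a made-up helper name, it only understands simple `name>=version` lines, and the sample requirements are the invented ones from the example above.

```python
import re

# Matches "name>=version" at the start of a line; anything after the
# version (for example ",<1.5") is ignored.
_MIN_SPEC = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*>=\s*([0-9][A-Za-z0-9_.]*)")


def pin_minimums(requirement_lines):
    """Turn 'pkg>=X' lines into exact 'pkg==X' pins, skipping everything else."""
    pins = []
    for line in requirement_lines:
        line = line.split("#", 1)[0]  # drop trailing comments
        match = _MIN_SPEC.match(line)
        if match:
            name, version = match.groups()
            pins.append(f"{name}=={version}")
    return pins


# Invented requirements, as in the example above.
sample = ["arviz>=0.7.0", "theano>=1.0.3  # invented", "# a comment", "numpy"]
print(pin_minimums(sample))
```

Lines without a lower bound (like the bare `numpy` here) are skipped on purpose, since there is no minimum to pin to; those still need a manual choice.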

And lastly, related to docker, we have a script in ArviZ (arviz/scripts at main in the arviz-devs/arviz GitHub repository) which builds a docker image with pymc, pystan, pyro and other ppls all at once, and I have never had any problem with it.


Hi @andrew1, sorry I didn't see this thread earlier. For what it's worth, I have a very opinionated answer to your question.

You are making a few common mistakes due to, IMO, a lack of documentation...

  1. You're mixing the anaconda channel (managed by Continuum Analytics) with the conda-forge channel (community managed). Namely you are using the continuumio/anaconda3 Docker image, which comes preinstalled with several hundred MB of conda packages from the anaconda channel, and then installing pymc3 from the conda-forge channel. This is a recipe for conflicts. Since the current versions of PyMC3 on conda-forge and anaconda are 3.11.2 and 3.8 respectively, I would recommend using conda-forge, and thus choosing a different Docker image.
  2. You're using the conda command, which uses an incredibly slow and outdated solver. For any package with complex dependencies like PyMC3, you will be waiting potentially several hours for the conda solver to complete. I highly recommend switching to mamba, which is effectively a drop-in replacement for conda with a much, much, much better solver.
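To check whether an existing environment mixes channels, one option is to group the output of `conda list --json` by channel. The sketch below assumes each entry in that JSON carries `name` and `channel` fields, which is what I see on my installs; `packages_by_channel` is just an illustrative helper name.

```python
import json
from collections import defaultdict


def packages_by_channel(conda_list_json):
    """Group package names by the channel they were installed from.

    `conda_list_json` is the text printed by `conda list --json`
    (assumed to be a JSON list of objects with "name" and "channel").
    """
    grouped = defaultdict(list)
    for entry in json.loads(conda_list_json):
        grouped[entry.get("channel", "unknown")].append(entry["name"])
    return {channel: sorted(names) for channel, names in grouped.items()}


# Example use (not run here):
#   conda list --json > packages.json
#   with open("packages.json") as f:
#       print(packages_by_channel(f.read()))
```

If the result has more than one key, for example both `pkgs/main` and `conda-forge`, the environment is a mix of channels.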

With all that said, IMO if you are using the tools correctly, then conda-forge should transparently handle all the dependency management issues for you. If it does not work, and there is some problem with the resulting installation, then there is a bug in the recipe. (Indeed this is often the case.) Such a bug should be reported on the PyMC3 conda-forge feedstock, so that we can fix it.

Finally, my recommendation for your Dockerfile is

FROM condaforge/mambaforge:4.10.1-0
RUN mamba install pymc3

These "mistakes" regarding mixing of channels and using conda instead of mamba are extremely, extremely common. I know, because I made them myself not so long ago. And if everyone's always making the same mistakes, that indicates to me that there's a documentation problem.

Regarding the documentation, in particular wrt mamba, I have already raised the issue a few months ago on the conda-forge gitter. The upshot seems to be that there is support for improving the documentation, and there is an ongoing project, so hopefully things will improve soon. If you are so inclined, I'd encourage you to share your experience on Gitter, in particular regarding how the documentation could be improved.

Regarding the mixing of channels, I think a bit of the difficulty is that conda-forge and Continuum IO are distinct entities. Even though there is perhaps some "competition" between the two, Continuum hosts the massive conda-forge package library. The relationship seems to be very symbiotic, but personally I found it quite confusing at first.

I hope that helps to explain the situation a bit. Best of luck, and please let me know if I can be of further help.


Thank you for both of those replies - that is very helpful.

The situation is actually that I am thinking about how to create a number of models, and am trying to think ahead about how these should be organized and deployed. There are a number of other packages in the overall mix, and I suspect that I might hit some problems down the road unless things are properly arranged, but things are now becoming much clearer.

You are also right that I had not properly appreciated the point wrt anaconda and conda-forge, but now it all makes perfect (but a bit messy) sense.

A