This topic is about the Outreachy project proposal titled Improve PyMC3 example notebooks for PyMC3 4.0 and Aesara, and it is aimed at Outreachy applicants. For ideas on how to improve pymc-examples or the pymc docs in general, or to propose a new notebook for the repo, create a new topic or open an issue at pymc-examples.
As you can see in the project description, the only hard requirement to apply to this project is being experienced with the GitHub workflow: forking repos and using multiple branches to work on multiple features at once, to eventually submit a PR that can be merged into the main branch. In pymc3 and pymc-examples, there are also other requirements to meet while working: code style and formatting, testing, collaboration guidelines… These are described in the contributing guide.
The requirement to apply to this project is to have opened (note: opened, not merged) at least 3 PRs to pymc-examples before April 29th, with at least two of them open at the same time. To minimize git conflicts in pymc-examples, where most of the content is Jupyter notebooks, each PR should modify only 1-2 notebooks, so it is crucial to be able to work on multiple branches at the same time.
I have opened an issue per notebook using the tracker id label. I am progressively going over the notebooks to add extra labels and guidance specific to each notebook. I won’t be able to cover all the notebooks, but that should not be a problem. I have also written a somewhat detailed wiki page on things that need to be updated. I will keep updating this document as more notebooks are updated and I see more examples of outdated code.
You should consider all issues with the label “tracker id” that are not in the “Best Practices” column of the notebook tracker as open, even if I have not yet added specific labels and guidance to the issue. This is not a requirement to apply, but I encourage you to go over at least one notebook with no specific guidance on the issue and see for yourselves where and how it can be improved, both by looking at the wiki page linked above and by reading it as a reader would. There could be many other things to improve in the notebooks, in terms of both code and content.
At the time of writing, there are only 6 issues in the “Best Practices” column plus some open PRs, which should leave ~80 issues to choose from. Let me know if you need help choosing a notebook to get started with.
cc @Abhipsha_Das @missaishagurung
Note on VI project: The requirements for the “Integrate Variational Inference with the JAX backend” project also imply cloning pymc-examples and working on a dedicated branch on a new f-divergences notebook. You may consider applying to both projects, so we want to minimize “duplicated” requirements. Here is how the work on VI requirements can partially cover the requirements for pymc-examples project.
The f-divergences branch can count as one of the 3 PRs if it was created from main and you either send me the link to the branch or open a PR from it (you can use the “no merge” label for that). Moreover, if you show that you have intertwined commits between the f-divergences branch and one of your other branches from which you have submitted a PR, this can also count as the 2 simultaneous PRs. For example, having a branch outreachy-name with commits on April 9th, April 13th and April 22nd and no open PR from it, plus a branch notebook-xyz with commits on April 11th, April 15th and April 19th and an open PR from it, will count as if you had two PRs open at the same time.
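If you want to check your own branches against the example above, here is a minimal sketch of one possible reading of "intertwined commits". It is only an illustration, not an official check: the helper names `branch_switches` and `are_interleaved`, the threshold of two switches and the year are all assumptions of mine. It merges the commit dates of both branches into one timeline and counts how often the work moves from one branch to the other.

```python
from datetime import date


def branch_switches(commits_a, commits_b):
    """Count how often the branch changes when all commits are read in date order."""
    timeline = sorted(
        [(day, "a") for day in commits_a] + [(day, "b") for day in commits_b]
    )
    labels = [label for _, label in timeline]
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


def are_interleaved(commits_a, commits_b):
    """Assumed rule: interleaved means you came back to a branch after touching the other."""
    return branch_switches(commits_a, commits_b) >= 2


YEAR = 2021  # assumption: the post does not state the year

# Commit dates taken from the example in the post
outreachy_name = [date(YEAR, 4, 9), date(YEAR, 4, 13), date(YEAR, 4, 22)]
notebook_xyz = [date(YEAR, 4, 11), date(YEAR, 4, 15), date(YEAR, 4, 19)]

print(are_interleaved(outreachy_name, notebook_xyz))
```

With the dates from the example the branches alternate several times, so the check passes. Two branches worked on strictly one after the other would give a single switch and would not pass under this assumed rule.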
Let me know if something is not clear or if you need any help with pymc, git or anything else related to the application.