I am planning to apply for GSOD 2021, and one aspect of my proposal is to alter the existing documentation model. I’ll send in my proposal today to Christopher Fonnesbeck sir; meanwhile, it would be interesting to get started on the work as well. Another aspect of my proposal is to update the outdated regions of the documentation, and the getting started tutorial looks like a good starting point. @OriolAbril sir, please weigh in with your suggestions on the changes that you think might be needed and I’ll send in a pull request right away.
Hi, I’ll throw some ideas in the mix. Note that these are my ideas, I am not speaking on behalf of the whole project nor should you take this at face value and base your proposal on that alone.
The proposal outlined in the wiki has the following scope
The current documentation is already quite extensive, as mentioned in the proposal too, especially regarding tutorials and examples/case studies/how-to guides. However, it has been written progressively by the community over many years.
Due to this collaborative effort of people generally writing one notebook at a time, the notebooks don’t have much structure. Many notebooks don’t have a "see also" section linking to more advanced cases, or introductory warnings that you should be familiar with notebooks x and y to be able to follow the current one… There is also little to no cross-referencing between the API reference docs and the tutorials+examples docs. Moreover, even in the cases where some notebooks build on top of another one, they may not follow the same notation and conventions, so the reader needs to be able to abstract those concepts in order to relate the content between the different notebooks.
Due to this “temporal spread” coupled with the fast development pace as of late with externalization of plots and stats to ArviZ and the next major version backed by Aesara, most notebooks are now outdated and are no longer indicative of pymc3 best practices, in some cases they are not even indicative of pymc3 capabilities anymore!
In my opinion, the technical writing project should focus on (1), analyze the current state of the documentation, and formulate a plan for integration, revision and expansion, first building the resources and infrastructure for everyone to review and update the notebooks again in a collaborative effort. There are 85 notebooks of varying length, technical depth and nicheness (is this a word?), and updating them all requires advanced PyMC3, ArviZ and Aesara knowledge, linear algebra, domain expertise in multiple fields… (not every notebook needs all these skills to be updated, but updating them all does). Thus, it will probably be more efficient to structure and define how the documentation should ideally be, update or write some notebooks as examples of best documentation practices, and have the rest of the community and the Outreachy intern (if we go ahead with the pymc-examples project) update the rest of the notebooks. That is, I don’t think that having you (or any technical writer) understand, install, run and update the notebook that uses keras is the best use of anyone’s time.
Thanks for your opinion on the matter sir, I appreciate the detailed explanation. I understand your concern about the need for expertise when it comes to the documentation. I’m currently pursuing an undergraduate course focusing on different aspects of machine learning, high-level linear algebra and data science. I realise that the skills needed for the documentation might be extensive, but I can assure you that during the timeline of the program, I can quickly analyse the required needs and contribute toward the development of concise documentation. In my proposal [submitted already], I have explained the evaluation metrics for the contributions completed. Updating the documentation based on the comments of the PyMC3 userbase on discourse is one of the aspects. In my work, I would highly appreciate constructive criticism and would make continuous changes until it satisfies the users and the team. I have gone through all the notebooks available on the webpage and plan on updating them from the viewpoint of a Python beginner learning to code. With the guidance of experts like you, I believe that significant improvements in the documentation can be achieved.
To be extra clear, my concern is not that you may lack expertise in pymc3/arviz/aesara/stats or anything similar. I wanted to emphasize that while we may not have too much time on our hands to update the notebooks or create new ones, we already have plenty of expertise on that front, so focusing GSoD on that only adds value in overcoming these time constraints. On the other hand, we don’t have much expertise in organizing documentation and creating a comprehensive and well-integrated documentation corpus, or in setting up learning paths linking resources from the docs, separating documentation along different axes for better navigation and conciseness… So I find this kind of expertise much more valuable, as it provides a completely new skill set and point of view to the pymc project.
Apologies for the misunderstanding sir, that explains it clearly. I think we’re both on the same page on that aspect and it is something that can be figured out. Based on the interactions with the team, a proper plan can be devised according to their needs.
The documentation needs some major changes in the existing layout as well. The current design can be changed and the proposed changes are included in my proposal. I have been working on my contributions since 25 and I will be sending the work to Christopher sir as a part of my weekly update by EOD 31 [IST]. My plan here essentially is to continue auditing the content and the layout of the webpage. Based on the recommendations of the team, I can make changes to the content and simultaneously send the work done during that time. Should I send you my proposal through a personal message?
Hi, I’d just like to interject and point to a great resource for understanding HOW to write documentation:
PyMC3 already somewhat follows this (only somewhat), but keeping these 4 pillars in mind may be very helpful in updating the docs to be coherent.
Yes, this is a great resource! It played an important role in the definition of our initial project proposal and I believe we have the link there. It has recently been moved to a new url https://diataxis.fr/ where it will continue to be updated and improved.
Thanks for the reviews @NowanIlfideme and @OriolAbril. For reference, I’ve included a sample proposal for changing the theme of the documentation at: Hierarchical Partial Pooling — PyMC3 3.10.0 documentation.
Here’s my work -
I’ve rewritten the content there. Please send in reviews. Note that I haven’t included the results of the code yet, and some more changes are still needed here.
Meanwhile, I am planning to start more discussions for the other sections of the documentation.
This is the pydata theme right? Do you have the source to generate that?
I’m asking more than anything because I changed the theme of the ArviZ documentation a while back and I don’t want you to dedicate too much time to theme details. It can be tricky to organize the content so that both left and right sidebars show the right tables of contents, and I have already worked on that quite a lot, so I can help with getting it right in 1-2 tries.
Yea I figured out a way to compile something similar to that.
Actually, I can try to incorporate the details once all the documents have been converted to the theme. I worked on that because one of the requirements on the GSOD page was to convert the Jupyter notebook based theme to an interactive layout.
The theme should work out of the box without any changes to the documentation sources, only to the conf.py file. Another extra change we may want to make is using MyST-NB instead of nbsphinx like we do now, but these are two independent changes to address different problems.
Even if we change to MyST, the current files don’t need to be modified, but they can be modified to take advantage of all the sphinx+rst features that they currently can’t use because they are written in common markdown.
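As a rough sketch of how those two changes could look in conf.py (this is not the actual PyMC3 configuration; it assumes the pydata-sphinx-theme and myst-nb packages are installed, and the `use_myst_nb` toggle is a made-up name used only for illustration):

```python
# conf.py (sketch): only the settings touched by the two changes discussed above.
# Not the project's real configuration; other settings are omitted.

# Change 1: switch the HTML theme. The documentation sources stay untouched.
html_theme = "pydata_sphinx_theme"

# Change 2 (independent of the theme): pick the extension that renders notebooks.
# `use_myst_nb` is a hypothetical toggle, here only to show that the choice is
# separate from the theme setting.
use_myst_nb = False

notebook_extension = "myst_nb" if use_myst_nb else "nbsphinx"

extensions = [
    "sphinx.ext.autodoc",  # API reference generation
    notebook_extension,    # notebook rendering: nbsphinx now, MyST-NB later
]
```

Only one of the two notebook extensions is listed at a time because, as far as I know, both handle .ipynb files and are not meant to be enabled together.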
That sounds good sir, I agree with your idea.