Code Validation and Reviews

Flameeyes

5 years ago

During my “staycation” week I decided to spend some time looking at my various Python projects, trying to make them easier, nicer and cleaner to work on, both for contributors and for myself. I also decided to spend some time improving some of the modules I use, based on various lessons I learned in my own tools, and that got me to learn more about those tools.

First of all, thanks to Ben, glucometerutils has been fully re-formatted with black and isort already. He also set it up with pre-commit, to make sure that new changes don’t break the formatting. This is awesome.

As I was recently discussing with Samuele on Twitter, it’s not that I always agree with the formatting choices in black. It’s that it takes away the subjective matter of agreement on formatting. Black does not always make the code look the way I would make it look, and I could argue that I can do a better job than black. But it also makes everybody’s code look the same, which is actually a great way to settle the matter.

This is something I definitely picked up in my dayjob — I have been part of the Python Readability program (you can find more about it in the Building Secure and Reliable Systems book, downloadable for free), and through that I have reviewed literally hundreds of changes, coming from all different parts of Alphabet. While my early engagement had lots of comments on code formatting, the moment when Python formatting tools became more reliable and their usage became widespread, the “personal load” of doing those reviews went down significantly, as I could finally focus on pointing out overly complex functions, mismatching code and documentation, and so on. For everything else, my answer was unchanging: “let the formatter do its job, maybe just add a comma there as a suggestion to it.”

This is why I’m happy about black: not because I think its formatting is 100% what I would do, but because it gets close enough, and removes the whole argument and uncertainty around it. The same applies to isort, as well.

While applying pretty much the same set of presubmits and formatting to python-pcapng, I also found out about flake8. This is another useful tool to reduce the work needed for reviews, and it can also be configured to run as part of the pre-commit hooks, making sure that violations are identified sooner rather than later. While the tool is designed to deal with style guide violations, it also turned out to identify a few outright mistakes in glucometerutils. I’m now going to apply it throughout, whenever I can.

There are more checks I would actually want to integrate. Today I was going through all the source files in glucometerutils to update the type annotations, since I dropped Python 3.6 support. As I went to do that I realised that one of the files created in a pull request I approved recently was actually missing licensing information. I have now added both license and copyright annotations as suggested by Matija (who is an actual lawyer, unlike me), but I would love a pre-commit check that just ensured that all the files have license and copyright annotations, and that the license is the expected one, for instance.
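To make the idea a bit more concrete, here is a minimal sketch of what such a check could look like. It is not an existing hook: it assumes the annotations use the `SPDX-FileCopyrightText` and `SPDX-License-Identifier` tags within the first few lines of each file, and the names `check_header`, `EXPECTED_LICENSE` and `HEADER_LINES`, as well as the `MIT` value, are all placeholders picked for illustration.

```python
import sys
from pathlib import Path
from typing import List, Optional, Sequence

# Assumptions for this sketch: the expected license and how far into
# the file the annotations are allowed to appear.
EXPECTED_LICENSE = "MIT"
HEADER_LINES = 10

COPYRIGHT_TAG = "SPDX-FileCopyrightText:"
LICENSE_TAG = "SPDX-License-Identifier:"


def check_header(text: str, expected_license: str = EXPECTED_LICENSE) -> List[str]:
    """Return the list of problems found in the first lines of a file."""
    head = text.splitlines()[:HEADER_LINES]
    problems = []

    if not any(COPYRIGHT_TAG in line for line in head):
        problems.append("missing copyright annotation")

    # Collect whatever follows the license tag on each annotated line.
    licenses = [
        line.split(LICENSE_TAG, 1)[1].strip()
        for line in head
        if LICENSE_TAG in line
    ]
    if not licenses:
        problems.append("missing license annotation")
    elif expected_license not in licenses:
        problems.append(f"unexpected license {licenses[0]!r}")

    return problems


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Check every file named on the command line; non-zero means failure."""
    paths = sys.argv[1:] if argv is None else list(argv)
    status = 0
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        for problem in check_header(text):
            print(f"{path}: {problem}")
            status = 1
    return status
```

Since pre-commit passes the names of the changed files as arguments and treats a non-zero exit status as a failure, a script like this would only need to end with `sys.exit(main())` to be usable as a local hook.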

There are a few more trivial checks available in pre-commit that I may actually enable throughout: checks for trailing whitespace, and for missing newlines at the end of files. All of those are easily fixed, and the fixers do exactly that, which is also a great way to make the checks easier on newcomers and contributors: you don’t just get told that “it’s wrong”, but also “let me fix that for you already”.
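The “fix it for you” behaviour is simple enough to sketch in a few lines. This is not the code of the actual pre-commit hooks, just an illustration of the idea under simplifying assumptions: files are UTF-8 text, and the hypothetical `fix_whitespace` and `fix_file` helpers normalise line endings to `\n` as a side effect.

```python
from pathlib import Path


def fix_whitespace(text: str) -> str:
    """Strip trailing whitespace and end the text with a single newline."""
    lines = [line.rstrip() for line in text.split("\n")]
    # Drop the blank lines at the end, so that exactly one newline remains.
    while lines and not lines[-1]:
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


def fix_file(path: str) -> bool:
    """Rewrite the file in place if needed; return True if it was changed."""
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    fixed = fix_whitespace(original)
    if fixed != original:
        target.write_text(fixed, encoding="utf-8")
        return True
    return False
```

A fixer hook built this way would report failure whenever `fix_file` returns `True`, so that the contributor gets to review the rewritten file and commit again.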

It’s not quite as encompassing as the bubble I’m used to, but it seems to be the closest I’m getting to it right now. Maybe I should just start building all the hooks that I feel I need, and see if someone else will adopt them afterwards. It seems to be a common thing to do after all.

Anyone who has written Gentoo ebuilds, by the way, has most likely recognized similarities with repoman, the tool used to validate ebuilds before submitting them to the tree. I think that’s possibly why I’m so interested in this. I do agree that tools like repoman are the way to go, and I have myself insisted on repoman being extended to cover more and more cases over time, as it would stop divergence.

I honestly hope to get to a point where there’s no argument over whether a code change complies with the style or not, and where the enforcement (and the fixing) is left to computers, whenever it is possible. And that also means helping the computers make it possible, by being less picky on things that can be overlooked.
