This Time Self-Hosted
dark mode light mode Search

Example Code Should Be Exemplar (The Nathaniel’s Nudge)

In my decade working in bubbles, one of the things I definitely learnt better than I did in open source is doing reviews. Part of it is because review tooling has gone a long way from mailing lists, despite what hardcore real programmers think, but the most impact was from working with a lot more people who knew what they do.

Nathaniel is one of those people, and someone I miss working with quite a lot. He and I worked on some of the least recognized yet widely relied upon internal libraries, and he taught me a lot both in terms of review, and approaching these high-stake, low-gain libraries.

One of the things he taught me is the phrase “Example code should be exemplar”, in relation to making always sure that documentation, code samples and, to a large extent, tests should always be kept up to date when APIs are changed, and should follow the best practices that people are expected to follow to get their code approved. I call this the Nathaniel’s Nudge in the same spirit as Hyrum’s Law.

I think this might be overlooked often enough that it bears explaining why this is important. First of all, example code is part of the documentation — if your documentation is not recommending the best practices, something is definitely wrong with it. I have made this observation before, and that’s the main driver behind Autotools Mythbuster having a “Who’s Afraid of Autotools” section with the at-the-time best practice for that build system.

I have already discussed my dislike for oversimplified tutorials that try to provide relatable examples that are already widely discredited — such as objects describing people expecting single-word name and surname fields, mandatory titles or genders, or email address validation that forget that gTLDs exist. Example code is the same, except that it’s even more entrenched for newcomers to look at.

Polishing example code sometimes feels tedious — particularly when the code was written years before, and actual practice changed since. In one case that involved both Nathaniel and me, the example code was attached to a full deep-dive documentation piece that required involving a tech writer to adapt and review the text to go together with the code. It was a mostly thankless week of back and forth between the three of us, but I want to say that the result was satisfying: within a couple of weeks we stopped receiving code reviews that included the specific mistake that had me look into the example code.

This brings me to one of my personal chestnuts when looking at code in review: «Where did you learn that from?» Particularly when reviewing code for newcomers (to the company, language, or sometimes programming altogether!), it’s easy to spot code that looks too polished for them to have improvised, and yet not correct — which in my experience either comes from out of date documentation or code memes. More than once in my core library maintainer role, I decided to go out of my way to modernize callers’ code (which is much easier in an internal monorepo codebase) to avoid a specific mistake to be propagated to new code.

Keeping a “live” best practices example is the best option I have found to deal with the code meme problem: if you can send people to the sample code and tell them that it already follows best practices, they are a lot less likely to randomly search the codebase (or the net) for example of how to achieve whatever they are trying to do in the first place, and find results that are way out of date. But that requires the sample code is realistic enough.

Without repeating myself too much, if your sample code tries to cut all the corners by being the simplest example with no or few features in use, it might help as teaching material, but it won’t help for best practices. A common pushback I receive on this point is that if you add too many features, the code will become too complex to use as sample or documentation — this is a real problem, but I don’t believe that the right solution for it is to keep the example unrealistic.

To take another abstract example from the past, I have often reviewed code for automated, non-interactive pieces of software — call them “scripts” if you like, although that feels like significantly undervaluing the amount of effort required in making those software components work together. These components need to be able to expose some state to both humans and other systems when they run, even though they are not otherwise interactive — for example they may need to expose metrics to a scraper such as Prometheus.

When I looked at the steps necessary to provide such an interface based on exemplar code, it was messy: the same or similar value had to be passed as parameter to different functions, methods that were at one time optional became mandatory but you still had to call them explicitly, and so on. The easiest case, of a processing binary that only required an interface to be scraped for metrics, would require easily twelve lines of initialization boilerplate!

You could have “simplified” the example code by removing all of these calls and parameters that are necessary in production, and focus on the non-boilerplate parts… or you could do what we did, and provide a better abstraction. Since my work was almost entirely focused on Python, the Pythonic answer for that was to provide a less flexible, but easier to use library as a context manager.

Reducing the flexibility in this case is a very intentional step. Just like schemas and formats, the value is added by reducing the uncertainty. If you know that a binary named “coffeegrinder” will report itself as “coffeegrinder” (rather than “Grinder”, “Coffee Grinder”, “Koffee Griner”, etc) you know what to look for without having to second-guess yourself. And this is another place where keeping example code exemplar can help: how do you handle spaces in service names? If you have them normalized differently (dashes vs underscores), how does it look like when you try to use them? Do you need three separate constants for “Water Boiler”, “water_boiler”, and “water-boiler” depending where you consume them? Or would you rather let eh user provide the one string, and normalize it down the stack? Looking at it from a minimalist, but up to best practices, example might help improving your interfaces for everyone.

I said before that a lot of the things that are possible to do in a bubble are the effect of a long series of independent choices that just happen to get to a nexus of tools being available to make them come true. Exemplar code is one of those choices, but it won’t get you as far as I have described without other tools: a review system for docs, CASE tools for code modification and large scale changes, and code validation that can stop known-incorrect patterns from being introduced.

But consider starting at least with something, and heed Nathaniel’s Nudge: example code should be exemplar! As for myself, I spent some time updating Zephyr’s boards definitions to follow the same comments I have received when sending my own first board pull request, and I will keep doing this as I keep exploring the project. Because if anything, I would like to see more people follow my… example.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.