On Rake Collections and Software Engineering

autum, earth's back scratcher

Matthew posted on twitter a metaphor about rakes and software engineering – well, software development but at this point I would argue anyone arguing over these distinctions have nothing better to do, for good or bad – and I ran with it a bit by pointing out that in my previous bubble, I should have used “Rake Collector” as my job title.

Let me give a bit more context on this one. My understanding of Matthew’s metaphor is that senior developers (or senior software engineers, or senior systems engineers, and so on) are at the same time complaining that their coworkers are making mistakes (“stepping onto rakes”, also sometimes phrased as “stepping into traps”), while at the same time making their environment harder to navigate (“spreading more rakes”, also “setting up traps”).

This is not a new concept. Ex-colleague Tanya Reilly expressed a very similar idea with her “Traps and Cookies” talk:

I’m not going to repeat all of the examples of traps that Tanya has in her talk, which I thoroughly recommend for people working with computers to watch — not only developers, system administrators, or engineers. Anyone working with a computer.

Probably not even just people working with computers — Adam Savage expresses yet another similar concept in his Every Tool’s a Hammer under Sweep Up Every Day:

[…] we bought a real tree for Christmas every year […]. My job was always to put the lights on. […] I’d open the box of decorations labeled LIGHTS from the previous year and be met with an impossible tangle of twisted, knotted cords and bulbs and plugs. […] You don’t want to take the hour it’ll require to separate everything, but you know it has to be done. […]

Then one year, […] I happened to have an empty mailing tube nearby and it gave me an idea. I grabbed the end of the lights at the top of the tree, held them to the tube, then I walked around the tree over and over, turning the tube and wrapping the lights around it like a yuletide barber’s pole, until the entire six-string light snake was coiled perfectly and ready to be put back in its appointed decorations box. Then, I forgot all about it.

A year later, with the arrival of another Christmas, I pulled out all the decorations as usual, and when I opened the box of lights, I was met with the greatest surprise a tired working parent could ever wish for around the holidays: ORGANIZATION. There was my mailing tube light solution from the previous year, wrapped up neat and ready to unspool.

Adam Savage, Every Tool’s a Hammer, page 279, Sweep up every day

This is pretty much the definition of Tanya’s cookie for the future. And I have a feeling that if Adam was made aware of Tanya’s Trap concept, he would probably point at a bunch of tools with similar concepts. Actually, I have a feeling I might have heard him saying something about throwing out a tool that had some property that was opposite of what everything else in the shop did, making it dangerous. I might be wrong so don’t quote me on that, I tried looking for a quote from him on that and failed to find anything. But it is something I definitely would do among my tools.

So what about the rake collection? Well, one of the things that I’m most proud of in my seven years at that bubble, is the work I’ve done trying to reduce complexity. This took many different forms, but the main one has been removing multiple optional arguments to interfaces of libraries that would be used across the whole (language-filtered) codebase. Since I can’t give very close details of what’s that about, you’ll find the example a bit contrived, but please bear with me.

When you write libraries that are used by many, many users, and you decide that you need a new feature (or that an old feature need to be removed), you’re probably going to add a parameter to toggle the feature, and either expect the “modern” users to set it, or if you can, you do a sweep over the current users, to have them explicitly request the current behaviour, and then you change the default.

The problem with all of this, is that cleaning up after these parameters is often seen as not worth it. You changed the default, why would you care about the legacy users? Or you documented that all the new users should set the parameter to True, that should be enough, no?

That is a rake. And one that is left very much in the middle of the office floor by senior managers all the time. I have seen this particular pattern play out dozens, possibly hundreds of times, and not just at my previous job. The fact that the option is there to begin with is already increasing complexity on the library itself – and sometimes that complexity gets to be very expensive for the already over-stretched maintainers – but it’s also going to make life hard for the maintainers of the consumers of the library.

“Why does the documentation says this needs to be True? In this code my team uses it’s set to False and it works fine.” “Oh this is an optional parameter, I guess I can ignore it, since it already has a default.” *Copy-pastes from a legacy tool that is using the old code-path and nobody wanted to fix.*

As a newcomer to an environment (not just a codebase), it’s easy to step on those rakes (sometimes uttering exactly the words above), and not knowing it until it’s too late. For instance if a parameter controls whether you use a more secure interface, over an old one you don’t expect new users of. When you become more acquainted with the environment, the rakes become easier and easier to spot — and my impression is that for many newcomers, that “rake detection” is the kind of magic that puts them in awe of the senior folks.

But rake collection means going a bit further. If you can detect the rake, you can pick it up, and avoid it smashing in the face of the next person who doesn’t have that detection ability. This will likely slow you down, but an environment full of rakes slows down all the newcomers, while a mostly rake-free environment would be much more pleasant to work with. Unfortunately, that’s not something that aligns with business requirements, or with the incentives provided by management.

A slight aside here. Also on Twitter, I have seen threads going by about the fact that game development tends to be a time-to-market challenge, that leaves all the hacks around because that’s all you care about. I can assure you that the same is true for some non-game development too. Which is why “technical debt” feels like it’s rarely tackled (also on the note, Caskey Dickson has a good technical debt talk). This is the main reason why I’m talking about environments rather than codebases. My experience is with long-lived software, and libraries that existed for twice as long as I worked at my former employer, so my main environment was codebases, but that is far from the end of it.

So how do you balance the rake-collection with the velocity of needing to get work done? I don’t have a really good answer — my balancing results have been different team by team, and they often have been related to my personal sense of achievement outside of the balancing act itself. But I can at least give an idea of what I do about this.

I described this to my former colleagues as a rule of thumb of “three times” — to keep with the rake analogy, we can call it “three notches”. When I found something that annoyed me (inconsistent documentation, required parameters that made no sense, legacy options that should never be used, and so on), I would try to remember it, rather than going out of my way to fix it. The second time, I might flag it down somehow (e.g. by adding a more explicit deprecation notice, logging a warning if the legacy codepath is executed, etc.) And the third time I would just add it to my TODO list and start addressing the problem at the source, whether it would be within my remit or not.

This does not mean that it’s an universal solution. It worked for me, most of the time. Sometimes I got scolded for having spent too much time on something that had little to no bearing on my team, sometimes I got celebrated for unblocking people who have been fighting with legacy features for months if not years. I do think that it was always worth my time, though.

Unfortunately, rake-collection is rarely incentivised. The time spent cleaning up after the rakes left in the middle of the floor eats into one’s own project time, if it’s not the explicit goal of their role. And the fact that newcomers don’t step into those rakes and hurt themselves (or slow down, afraid of bumping into yet another rake) is rarely quantifiable, for managers to be made to agree to it.

What could he tell them? That twenty thousand people got bloody furious? That you could hear the arteries clanging shut all across the city? And that then they went back and took it out on their secretaries or traffic wardens or whatever, and they took it out on other people? In all kinds of vindictive little ways which, and here was the good bit, they thought up themselves. For the rest of the day. The pass-along effects were incalculable. Thousands and thousands of soul all got a faint patina of tarnish, and you hardly had to lift a finger.

But you couldn’t tell that to demons like Hastur and Ligur. Fourteenth-century minds, the lot of them. Spending years picking away at one soul. Admittedly it was craftsmanship, but you had to think differently these days. Not big, but wide. With five billion people in the world you couldn’t pick the buggers off one by one any more; you had to spread your effort. They’d never have thought up Welsh-language television, for example. Or value-added tax. Or Manchester.

Good Omens page 18.

Honestly, I often felt like Crowley: I rarely ever worked on huge, top-to-bottom cathedral projects. But I would be sweeping around a bunch of rakes, so that newcomers wouldn’t hit them, and that all of my colleagues would be able to build stuff more quickly.