Testing is Like Onions: They Have Layers, And They Make You Cry

As a Free Software developer, and one that has worked on a number of separate projects as well as in totally different lines of work, I find myself having nuanced and varied opinions on a bunch of topics, which sometimes don’t quite fit into the “common knowledge” shared on videos, blog posts or even university courses.

One such opinion relates to testing software in general. I have written lots about it, and I have ranted about it more recently as I was investigating a crash in unpaper live on-stream. Testing is undoubtedly one of the most useful techniques for developers and software engineers to build “properly solid” software. It’s also a technique that, despite a lot of writing about it, I find nearly impossible to properly teach without first-hand experience.

I want to start this by saying that I don’t believe there is a universal truth about testing. I don’t think I know everything there is to know about testing, and I speak almost exclusively from experience — experience that I acquired in what is now over ten years of working in different spaces within the same industry, in sometimes less than optimal ways, and that has convinced me at times that I held the Truth (with capital T), just to crush my expectations a few months later.

So the first thing that I want you all to know, if you intend on starting down the path of caring more about testing, is to be flexible. Unless your job is literally responsible for someone’s life (medical, safety, self-driving), testing is not a goal in and by itself. It rather is a means to an end: building something to be reliable. If you’re working on a corporate project, your employer is much less likely to care that your code is formally verifiable, and more likely to care that your software is as bug-free as possible so that they can reap the benefits of ongoing revenue without incurring maintenance costs.

An aside here: I have heard a few too many times people “joking” about the fact that proprietary, commercial software developers introduce bugs intentionally so that they can sell you an update. I don’t believe this is the case, not just because I worked for at least a couple of those, but most importantly because software that doesn’t include bugs generally makes them more money. It’s easier to sell new features (or a re-skinned UI) — or sometimes not even that, but just keep changing the name of the software.

In the Free Software world, testing and correctness are often praised, and since you don’t have to deal with product managers and products overall, it sounds like this shouldn’t be an issue — but the kernel of truth there is that there’s still a tradeoff to be had. If you take tests as a dogmatic “they need to be there and they need to be complete”, then you will eventually end up with a very well tested codebase that is too slow to change when the environment around it changes. Or maybe you’ll end up with maintainers that are too tired to deal with it at all. Or maybe you’ll self-select for developers who think that any problem caused by the software is actually a mistake in the way it’s used, since the tests wouldn’t lie. Again, this is not a certainty, but there’s a chance it can happen.

With this in mind, let me go down the route of explaining what I find important in testing overall.

Premise and preambles

I’m going to describe what I refer to as the layers of testing. Before I do that, I want you to understand the premise of layering tests. As I said above, my point of view is that testing is a technique to build safe, reliable systems. But, whether you consider it in salary (and thus hard cash) in businesses or time (thus “indirect” cash) in FLOSS projects, testing has a cost, and nobody really wants to build something safely in an expensive way, unless they’re doing it for fun or for the art.

Since performative software engineering is not my cup of tea, and my experience is almost exclusively in “industry” (rather than “academic”) settings, I’m going to ignore the case where you want to spend as much time as possible to do something for the sake of doing something, and instead expect that if you’re reading further, you’re interested in the underlying assumption that any technique that helps is meant to help you produce something “more cheaply” — that is the same premise as most Computer-Aided Software Engineering tools out there.

Some of the costs I’m about to talk about are priced in hard cash, others are a bit more vacuous — this is particularly the case at the two extremes of the scale: small amateur FLOSS projects rarely end up paying for tools or services (particularly when they are proprietary), so they don’t have a budget to worry about. In a similar fashion, when you’re working for a huge multinational corporation that literally designs its own servers, it’s unlikely that testing ends up having a visible monetary cost to the engineers. So I’ll try to explain, but you might find that the metrics I’m describing make no sense to you. If so, I apologize, and I might try harder next time; feel free to let me know in a comment.

I’m adding another assumption here: testing is a technique that allows changes to be shipped safely. We want to ship faster, because time is money, and we want to do it while wasting as few resources as possible. These are going to be keywords I’m going to refer back to a few times, and I’m choosing them carefully — my current and former colleagues probably understand well how these fit together, but none of these are specific to an environment.

Changes might take a lot of different forms: it might be a change to the code of an application (patch, diff, changelist, …) that needs to be integrated (submitted, merged, landed, …), or it might be a new build of an application, with a new compiler, or new settings, or new dependencies, or it might be a change in the environment of the application. Because of this, shipping also takes a lot of different shapes: you may use it to refer to publishing your change to your own branch of a repository, to the main repository, to a source release, or directly to users.

Speed is also relative, because it depends on what the change is about and what we mean by shipping. If you’re talking about the time it takes you to publish your proposed change, you wouldn’t want to consider a couple of days as a valid answer — but if you’re talking about delivering a new firmware version to all of your users, you may accept even a week’s delay as long as it’s done safely. And something similar goes for cost (since it’s sometimes the same as time): you wouldn’t consider hiring a QA person to test each patch you write for a week — but it makes more sense if you have a whole new version of a complex application.

Stages and Layers

Testing has layers, like onions and orcs, and in my experience these layers are a direct result of the number of different definitions we can attach to the same set of words. A rough way to look at it is to consider the (rough) stages that are involved in most complex software projects: someone makes a change to the source code, someone else reviews it, it gets integrated into the project’s source code, then a person that might be one of the two already involved decides to call for a new release cut, and they eventually deliver it to their users. At each of these stages, there’s testing involved, and it’s always slightly different, both in terms of what it does, and what tradeoffs are considered acceptable.

I just want to publish my patch!

The first, innermost layer I think of when it comes to testing is the testing involved in me being able to publish my change — sometimes also referred to as sending it for review. Code review is another useful technique if used well, but I would posit it’s only useful if it focuses on discussing approaches, techniques, design, and so on – rather than style and nitpicks – which also means I would want to be able to send changes for discussion early: the cost of rejecting a sub-optimal change, or at least requesting further edits to it, is proportional to the amount of time you need to spend to get the change out for review.

So what you want at this stage is fast, cheap tests that don’t require specific resources to be run. This is the place of type-checking tools, linters, and pure, limited unit tests: tests that take a specific input, and expect the output to be either always the same or within well-established parameters. This is also where my first stone in the shoe needs to drop.
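Before getting to that stone, here is a minimal sketch of the kind of test that belongs in this layer, using a hypothetical parse_temperature() function: no network, no disk, a fixed input, and an output that is either exact or within well-established parameters.

```python
# Minimal sketch of a first-layer unit test (parse_temperature is hypothetical).
import unittest


def parse_temperature(raw: str) -> float:
    """Parse a reading like '21.5C' into degrees Celsius."""
    return float(raw.rstrip("C"))


class ParseTemperatureTest(unittest.TestCase):
    def test_exact_value(self):
        # Fixed input, output expected to always be the same.
        self.assertEqual(parse_temperature("21.5C"), 21.5)

    def test_within_parameters(self):
        # Not an exact value, but within well-established parameters.
        self.assertTrue(-50.0 <= parse_temperature("3.2C") <= 60.0)


if __name__ == "__main__":
    unittest.main()
```

Tests like this run in milliseconds and need nothing more than the interpreter, which is what makes them cheap enough to run before every review request.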

The term “change-detector test” is not widely used in public discourse, but it was a handy shorthand in my previous bubble. It refers to tests written in a way that is so tightly coupled with the original function that you cannot change the original function (even maintaining the API contract) without changing the test. These are an antipattern for most cases — there’s a few cases in which you _really_ want to make sure that if you change anything in the implementation, you go and change the test and explicitly state that you’re okay with changing the measured approach, such as when you mean to have a constant-time calculation.
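As a sketch of what that looks like in practice (names are hypothetical, and the test is meant to be run with pytest), the second assertion below pins down how the function is implemented rather than what it returns, so even a refactoring that keeps the API contract intact breaks the test:

```python
# Sketch of a change-detector test: normalize() and _strip_spaces() are
# hypothetical, and the test assumes pytest as the runner.
from unittest import mock


def _strip_spaces(value: str) -> str:
    return value.replace(" ", "")


def normalize(value: str) -> str:
    return _strip_spaces(value).lower()


@mock.patch(f"{__name__}._strip_spaces", wraps=_strip_spaces)
def test_normalize(strip_spaces):
    # The behavioural part: this is what the test should be about.
    assert normalize("Foo Bar") == "foobar"
    # The change-detector part: inline _strip_spaces(), or split it into two
    # helpers, and this fails even though normalize() behaves exactly the same.
    strip_spaces.assert_called_once_with("Foo Bar")
```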

There are also the all-mocks tests — I have seen these in Python for the most part, but they are not exclusive to it, since any language that has easy mocking and patching facilities can lead to this outcome — and for languages that lack those, overactive dependency injection can give similar results. These tests are set up in such a way that, no matter what the implementation of the interface under test is, it’s going to return you exactly what you set up in the mocks. They are, in my experience, a general waste of time, because they add nothing over not testing the function at all.
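A sketch of the same idea, again with hypothetical names: every collaborator is replaced by a Mock, so the assertion can only ever observe what the mock was told to return one line earlier, no matter what the implementation does (or whether it does anything at all):

```python
# Sketch of an "all-mocks" test; lookup_user() and the db interface are
# hypothetical.
from unittest import mock


def lookup_user(db, user_id):
    row = db.fetch_one("SELECT name FROM users WHERE id = ?", user_id)
    return row["name"]


def test_lookup_user_all_mocks():
    db = mock.Mock()
    db.fetch_one.return_value = {"name": "alice"}
    # "Passes", but it only proves that the mock returns what we configured it
    # to return: the SQL text could be completely wrong, or the function could
    # ignore the database entirely and hard-code the name, and this test would
    # not notice.
    assert lookup_user(db, 42) == "alice"
```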

So why are people even writing these types of tests? Well, let me be a bit blasphemous here, and call out one of the reasons I have seen used to justify this setup: coverage metrics. Coverage metrics are a way to evaluate whether tests have been written that “cover” the whole of the program. The concept is designed so that you strive to exercise all of the conditional parts of your software during testing, so the goal is to have 100% of the source code “covered”.

Unfortunately, while the concept is a great idea, the execution is often dogmatic, with a straight ratio of expected coverage for every source file. The “incremental coverage” metric is a similar concept that suggests that you don’t want to ever reduce the coverage of tests. Again, a very useful metric to get an idea of whether changes are unintentionally losing coverage, but not something that I would turn into a strict rule.

This is not to say that coverage metrics are not useful, or that it’s okay to not exercise parts of a program through the testing cycle — I just think that coverage metrics in the innermost layer are disingenuous and sometimes actively harmful, by introducing all-mocks and change-detector tests. I’ll get to where I think they are useful later.

Ideally, I would say that you don’t want this layer of tests to take more than a couple of minutes, with five being on the very high margin. Again, this falls back on the cost of asking for changes — if going back to make a “trivial” change would require another round of tests consuming half an hour, there’s an increased chance that the author would insist on making that change later, when they’ll be making some other change instead.

As I said earlier, there’s also matters of trade-offs. If the unit testing is such that it doesn’t require particular resources, and can run relatively quickly through some automated system, the cost to the author is reduced, so that a longer runtime is compensated by not having to remember to run the tests and report the results.

Looks Good To Me — Make sure it doesn’t break anything

There is a second layer of testing that fits on top of the first one, once the change is reviewed and approved, ready to be merged or landed. Since ideally your change does not have defects and you want to just make sure of it, you are going to be running this layer of testing once per change you want to apply.

In the case of a number of related changes, it’s not uncommon to run this test once per “bundle” (stack, patchset, … terminology changes all the time), so that you only care that the whole stack works together — although I wouldn’t recommend it. Running this layer of tests on each of the changes makes it easier to ensure they are independent enough that one of them can be reverted (rolled back, unlanded) safely (or at least a bit more safely).

This layer of tests is what is often called “integration” testing, although that term is still too ambiguous for me. At this layer, I care about making sure that the module I’m changing still exposes an interface and a behaviour consistent with the expectations of the consumer modules, and still consumes data as provided by its upstream interfaces. Here I would avoid mocks unless strictly required, and rather prefer “fakes” — with the caveat that sometimes you want to use the same patching techniques as used with mocks, particularly if your interface is not well suited for dependency injection.
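As a sketch of that caveat (hypothetical names throughout): the code under test builds its client inline rather than receiving it, so the test uses the same patching machinery as mocks would, but substitutes a fake, a real if minimal implementation, instead of a Mock that just echoes our expectations.

```python
# Sketch of patching in a fake where dependency injection isn't available;
# RealClient, FakeClient and record_login() are hypothetical.
from unittest import mock


class RealClient:
    """Stand-in for a networked datastore client."""

    def put(self, key, value):
        raise RuntimeError("would talk to the network")


class FakeClient:
    """A fake: minimal, but a real implementation kept under our control."""

    def __init__(self):
        self.written = {}

    def put(self, key, value):
        self.written[key] = value


def record_login(username):
    # Not well suited to dependency injection: the client is created inline.
    client = RealClient()
    client.put(f"logins/{username}", 1)
    return client


def test_record_login():
    # Same patching technique as with mocks, but the substitute is a fake.
    with mock.patch(f"{__name__}.RealClient", FakeClient):
        client = record_login("alice")
    assert client.written == {"logins/alice": 1}
```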

As long as these tests are made asynchronous and reliable, they can take much longer than the pre-review unit tests — I have experienced environments in which the testing after approval and before landing takes over half an hour, and it’s not that frustrating… as long as they don’t fail for reasons outside of your control. This usually comes down to the sequencing solution, and to being able to have confidence in the results of the tests — nothing is more frustrating than waiting for two hours to land a change just to be told “Sorry, someone else landed another change in the meantime that affects the same tests, you need to restart your run.”

Since the tests take longer, this layer has more leeway in what it can exercise. I personally would strictly consider network dependencies off-limits: as I said above you want to have confidence in the result, and you don’t want your change to fail to merge because someone was running an update on the network service you rely upon, dropping your session.

So instead, you look for fakes that can implement just enough of the interaction to provide you with signal while still being under your control. To give an example, consider an interface that takes some input, processes it and then serializes some data into a networked datastore: the first layer unit test would focus on making sure that the input processing is correct, and that the resulting structure contains the expected data given a certain input; this second layer of tests would instead ask to serialize the structure and write it to the datastore… except that instead of the real datastore dependency, you mock or inject a fake one.

Depending on the project and the environment, this may be easier said than done, of course. In big enterprises it isn’t unexpected for a team providing a networked service to also maintain a fake implementation of it. Or at least maintain an abstraction that can be used both with the real distributed implementation, and with a local, minimal version. In the case of a datastore, it would depend on how it’s implemented in the first place: if it’s a distributed filesystem, its interface might just be suitable to use both with the network path and with a local temporary path; if it’s a SQL database, it might have an alternative interface using SQLite.
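Sticking with the datastore example, here is a sketch of what the second-layer test might look like with a local fake injected in place of the networked implementation (all names are hypothetical):

```python
# Sketch of a second-layer test with an injected fake datastore; the names
# and the JSON-on-a-key/value-store design are hypothetical.
import json


class FakeDatastore:
    """In-memory stand-in implementing the same write interface."""

    def __init__(self):
        self.blobs = {}

    def write(self, key: str, payload: bytes) -> None:
        self.blobs[key] = payload


def store_record(datastore, key: str, record: dict) -> None:
    # Code under test: serialize the processed structure and write it out.
    datastore.write(key, json.dumps(record, sort_keys=True).encode("utf-8"))


def test_store_record_with_fake():
    fake = FakeDatastore()
    store_record(fake, "user/42", {"name": "alice", "visits": 3})
    # The serialization and the write path are exercised for real; only the
    # network is gone.
    assert fake.blobs["user/42"] == b'{"name": "alice", "visits": 3}'
```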

For FLOSS projects this is… not always an easy option. And this gets even worse when dealing with hardware. For my glucometerutils project, I wouldn’t be able to use fake meters — they are devices that I’m accessing, after all, without the blessing of their original manufacturer. On the other hand, if one of them was interested in having good support for their device they could provide a fake, software implementation of it, that the tool can send commands to and explore the results of.

This layer can then verify that your code is not just working, but it’s working with the established interfaces of its upstreams. And here is where I think coverage metrics are more useful. You no longer need to mock all the error conditions upstream is going to give you for invalid input — you can provide that invalid input and make sure that the error handling is actually covered in your tests.

Because the world is made of trade-offs, there are more trade-offs to be made here. While it’s possible to run this layer of tests for a longer time than the inner layer, it’s still often not a good idea to run every possibly affected test, particularly when working in a giant monorepo, and on core libraries. In these situations an often used trade-off has most changes going through a subset of tests – declared as part of the component being changed – with the optional execution of every affected test. It relies on manually curated test selection, as well as comprehensive dependency tracking, but I can attest that it scales significantly better than running every possibly affected test all the time.

Did we all play well together?

One layer up, and this is what I call Integration Testing. In this layer, different components can (and should) be tested together. This usually means that instead of using fakes, you’re involving networked services, and… well, you may actually have flakes if you are not resilient to network issues.

Integration testing is not just about testing your application, but also testing that the environment around it works along with it. This brings up an interesting set of problems when it comes to ownership. Who owns the testing? Well, in most FLOSS projects the answer is that the maintainers of a project own the testing of their project, and their project only. Most projects don’t really go out of their way to try to figure out whether the changes to their main branch cause issues for their consumers, although a few, at least when they are aware that the changes may break downstream consumers, might give it a good thought.

In bigger organizations, this is where things become political, particularly when monorepos are involved — that’s because it’s not unreasonable for downstream users to always run their integration tests against the latest available version of the upstream service, which is more likely to bump into changes and bugs of the upstream service than the system under actual test (at least after the first generation of bugs and inconsistencies is flattened out).

As you probably noticed by now, going up the layers also means going up in cost and time. Running an integration test with actual backends is no exception to this. You also introduce a flakiness trade-off — you could have an integration test that is always completely independent between runs, but to do so you may need to wait for a full bring-up of a test environment at each run; or you could accept some level of flakes, and just reuse a single test environment setup. Again, this is a matter of trade-offs.

The main trade-off to be aware of is the frequency of certain types of mistakes over others. The fastest tests (which in Python I’d say should be type checking rather than “actual testing”) should be covering mainly the easy-to-make mistakes (e.g. bytes vs str), while the first layer of testing should cover the interfaces that are the easiest to get wrong. Each layer of tests takes more time and more resources than the one below, and so it should be run less often — you don’t want to run the full integration tests on drafts, but also you may not be able to afford running it on each submitted change — so maybe you batch changes to test, and reduce the scope of a failure to within a few dozen changes.

But what if it does fail, and you don’t know which one of the dozen broke it? Well, that’s something you need to get an answer for yourself — in my experience, what makes it easy at this point is not allowing further code changes to be landed until the culprit change is found, and only using revisions that did pass integration testing as valid “cutting points” for releases. And if your batch is small enough, it’s much faster to have a bisection search between the previous passing run and the current one.
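A sketch of the bisection idea, with hypothetical helpers: given the ordered list of changes landed between the last passing run and the failing one, repeatedly testing the midpoint narrows the culprit down in a logarithmic number of runs instead of one run per change.

```python
# Sketch of bisecting a batch of changes; passes_at() is a hypothetical (and
# usually expensive) helper that runs the integration tests at a revision.
def find_culprit(changes, passes_at):
    """changes[0] is the last revision known to pass, changes[-1] the first
    known to fail; returns the earliest failing revision in between."""
    good, bad = 0, len(changes) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes_at(changes[mid]):
            good = mid
        else:
            bad = mid
    return changes[bad]
```

With a batch of a few dozen changes, that is five or six extra runs rather than one per change.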

If It Builds, Ship It!

At this point, you may think that testing is done: the code is submitted, it passed integration testing, and you’re ready to build a release — which may again consist of widely different actions: tag the repository, build a tarball of sources, build an executable binary, build a Docker image, …

But whatever comes here, there’s a phase that I will refer to as qualifying a release (or cut, or tag, or whatever else). And in a similar fashion as to what I did in Gentoo, it’s not just a matter of making sure that it builds (although that’s part of it, and that by itself should be part of the integration tests), it also needs to be tested.

From my experience here, the biggest risk at this stage is that the “release mode” of an application doesn’t work just as well as the “test mode”. This is particularly the case with C and other similar languages, in which optimizations can lead to significantly different code being executed than in non-optimized builds — this is, after all, how I had to re-work unpaper tests. But it might also be that the environments used to build for integration testing and for the final release are different, and the results differ because of that.

Again, this will take longer — although this time it’s likely that the balance of time spent would be on the build side rather than the execution time: optimizing a complex piece of software into a final released binary can be intensive. This is the reason why I would expect that test and release environments wouldn’t be quite the same, and the reason why you need a separate round of testing when you “cut” a release somehow.

Rollin’

That’s not the last round of “testing” that is involved in a full, end-to-end, testing view: when a release is cut, it needs to be deployed – rolled out, published, … – and that in general needs some verification. That’s because even though all of the tests might have passed perfectly fine, they never hit their actual place in a production environment.

This might sound biased towards distributed systems, such as cloud offerings and big organizations like my current and previous employers, but you have the same in a number of smaller environments too: you may have tested something in the staging environment as part of release testing, but are you absolutely certain that the databases running the production environment are not ever so slightly different? Maybe it’s a different user that typed in the schema creation queries, or maybe the hostname scheme between the two is such that there’s an unexpected character in the latter that crashes your application at startup.

This layer of testing is often referred to as healthchecks, but the term has some baggage so I wouldn’t stay too attached to it. In either case, while often these are not considered tests per se, but rather part of monitoring, I still consider them part of the testing layers. That is also because, if a system is sufficiently complex and critical, you may implement them exactly as part of testing, by feeding it a number of expected requests and observing the results.
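As a sketch of that last point, assuming a hypothetical HTTP service and endpoint: the check sends a known request to the freshly deployed instance and verifies the shape of the response, rather than only checking that the process answers at all.

```python
# Sketch of a healthcheck implemented as a test; the /lookup endpoint and the
# expected fields are hypothetical.
import json
import sys
import urllib.request


def check_deployment(base_url: str) -> bool:
    request = urllib.request.Request(
        f"{base_url}/lookup?item=healthcheck-probe",
        headers={"Accept": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            if response.status != 200:
                return False
            body = json.load(response)
    except (OSError, ValueError):
        # Connection errors, HTTP errors and malformed JSON all count as a
        # failed check.
        return False
    # The probe item and its expected fields are known ahead of time.
    return body.get("item") == "healthcheck-probe" and "price" in body


if __name__ == "__main__":
    sys.exit(0 if check_deployment("http://localhost:8080") else 1)
```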

Final Thoughts

Testing is a complicated matter, and I’m not promising I gave you any absolute truth that will change your life or your professional point of view. But I hope this idea of “layering” testing, and understanding that different interactions can only be tested at different layers, will give you something to go by.

Senior Engineering: Give Them Time To Find Their Own Stretch

This blog post is brought to you by a random dream I had in December.

As I reflect more on what it means for me to be considered a senior engineer, I find myself picking up answers from my own dreams as well. In a dream this past December, I felt myself being at the office (which, given the situation, is definitely a dream state), and in particular I found that I was hosting a number of people from my old high school.

As it turns out it’s well possible that the reason why I dreamt of my old high school is because I was talking with my wife about how I would like there to be an organization that would reach out to schools such as that one. But that’s not what the dream was about. And it probably was helped along by having seen a Boston Legal episode about high schools the night before.

In the dream, I was being asked by one of the visitors what I would recommend they do, so that more of their students would be successful, and work at big companies. Small aside: there were three of us from my high school, and my year, working in Google when I joined – as I understand one of them was in New York and then left, but I was never really in touch with him – the other guy was actually in Dublin the same as me for a while, and I knew him well enough that I visited him when he moved away.

What I answered was something along the lines of: just don’t fill them up with homework that takes up their time. They won’t need the homework. They need to find something they can run with. Give them time to find their own stretch.

This is likely part of my personal, subjective experience. I generally hated homework from the beginning, and ignored most of it most of the time, if I could get away with it (and sometimes even when I couldn’t). Except when the homework was something that I could build upon myself. So while on most days I would come home from school and jump on Ultima OnLine right after lunch, other times I would be toying with expanding on what my programming teacher gave me or, as I have told before, with writing a bad 8086 emulator.

But it reminded me of the situation at work, as well — “stretch” being a common word used to refer to the work undertaken by tech employees outside of their comfort zone or level, to “reach” their next goal or level. This plays especially into big tech companies, where promotions are given to those who already act as the next level they’re meant to reach.

While thankfully I only experienced this particular problem first hand a few times, and navigated most of them away, this is not an uncommon situation. My previous employer used quarterly OKR planning to set up the agenda for each quarter, and particularly in my organization, the resulting OKRs tended to over-promise, by overcommitting the whole team by design. That meant taking on enough high-priority tasks (P0 as they are usually referred to) to take up the whole time of many engineers just to virtually keep the lights on.

Most of this is a management problem to solve, particularly when it comes to setting expectations with partner teams about the availability of the engineers who could be working on the projects, but there are a number of things that senior engineers should do, in my opinion, to prevent this from burning junior engineers.

When there’s too many “vertical” stakeholders, I still subscribe to the concept of diagonal contributions – work on something that brings both you, and many if not all of your verticals, in a better place, even when you could bring a single vertical to a much better place ignoring all the others – but for that to work out well, you need to make the relationship explicit, having buy-in from the supported teams.

The other important point is just to make sure to “manage upwards”, as I’ve been repeatedly told to do in the previous bubble, making sure that, if you have juniors in your team, they are given the time for training, improving, collaborating. That means being able to steer your manager away from overcommitting those who need more time to ramp up. It’s not a hypothetical — with my last manager at my previous role, I had to explicitly say a few times “No, I’m not going to delegate this to Junior Person, because they are already swamped, let them breathe.”

As I said previously, I think that it’s part and parcel of a senior engineer’s job to let others do the work, and not be the one concentrating the effort — so how do you reconcile the two requirements? I’m not suggesting it’s easy, and it definitely varies situation by situation. In the particular case I was talking about, the team was extremely overcommitted, and while I could have left a task for Junior Person to work on for a quarter or maybe two, it would have become an urgent task afterwards, as it was blocking a migration from another team. Since it would have taken me a fraction of the time, I found it more important for Junior Person to find something that inspired them, rather than something that our manager decided to assign.

Aside: a lot can be written about the dynamics of organizations where managers get to say “I know they said the deadline is Q4, but nobody believes that, we’ll work on it in Q1 next year.” The technical repercussion of this type of thinking is that migrations drag along a lot longer than intended and planned, and supporting the interim state makes them even more expensive on both the drivers and receivers of the migration itself. Personally, if I’m told that Q4 is a deadline, I always tried to work towards migrating as many of the simple cases as possible by Q3. That allowed me to identify cases that wouldn’t be simple, making requests to the migration drivers if needed, and left Q4 available for the complex cases that needed a lot more work. If you don’t do that, you risk assuming that something is a simple case up until the deadline, and then find out it’s another quarter’s worth of project work to address it.

Of course, a lot of it comes down to incentives, as always. My understanding of the state of it before I left my previous company is that the engineers are not evaluated directly on their OKR completion, but managers are evaluated on their team’s. Making that clear and explicit was itself a significant life improvement, I would say — particularly because this didn’t use to be the case, and I was not alone in having seen the completion rate as a fantasy number that didn’t really do much — much has been written by others on OKR and planning, and I’m no expert at that, so I won’t go into details of how I think about this altogether.

Keeping this in mind, another factor in providing stretch to more junior engineers is to provide them with goals that are flexible enough for them to run with something they found interesting — leaving it as your task, as the senior person, to make sure the direction is the right one. To give an example, this means that instead of suggesting one specific language for implementing a tool, you should leave it up to the engineer working on it to choose one — within the set of languages the team has accepted, clearly.

It also means not requiring specific solutions to problems, when the solution is not fully fleshed out already. Particularly when looking at problems that apply to heterogeneous systems, with different teams maintaining components, it’s easy to fall into the trap of expecting the solution to be additive – write a new component, define a new abstraction layer, introduce a compatibility library – ignoring subtractive solutions — remove the feature that’s used once every two years, delete the old compatibility layer that is only used by one service that hasn’t been ported yet. Focusing on what the problem to be solved is, rather than how it needs to be solved, can give plenty of room to stretch.

At the very least this worked for me — one of the tasks I was given in my first team was to migrate between two different implementations of a common provider. While I did do that to “stop the fire”, I also followed up by implementing a new abstraction on top of it, so I could stop copy-pasting the same code into ten different applications. I maintained that abstraction until I left the company, and it went from something used within my team only, to the suggested solution to that class of problems.

The other part that I want to make sure people consider is that deciding that what you were assigned is not important, and that there are better things to do, is not a bad thing. Trying to make it harder to point out “Hey, I was assigned (or picked up) this goal, but after talking with stakeholders directly there’s no need to work on it” is just going to lead to burnout.

To go back to the continuous example of my last team at the previous employer, I found myself having to justify myself for wanting to drop one of the goals from my OKR selection – complicated by the fact that OKR completion was put on my performance plan – despite the fact that the area expert had already signed off on it. The reason? We were already a month into the quarter, and it seemed too late in the game to change our mind. Except the particular goal was not the highest priority, and the first month of the quarter was pretty much dedicated full time to one that was. Once I started poking at the details of this lower priority goal, I realized it would not gain anything for anyone – it was trading technical debt in one place for technical debt in another – while we had a more compelling problem on a different project altogether.

Again, all of this does not lead to a cookie-cutter approach — even senior engineers joining a new space might need to be told what to do so that they are not just wasting air for a while. But if you’re selecting tasks for others to work on, make sure that they get an opportunity to extend on it. Don’t give them a fully-detailed “just write the code for this very detailed specification” if you can avoid it, but rather tell them “solve this problem, here’s the tools, feel free to ask for details”.

And make sure you are around to be asked questions, and to coach, and to show what they may not know. Again from personal experience, if someone comes to you saying that they’d like advice on how to write design docs that pass arbitrarily complicated review processes, the answer “Just write more of them” is not useful. Particularly when the process involves frustrating, time-consuming synchronous reviews with people wanting you to know the answer to all their “What if?” questions without having heard them before.

In short, if you want your juniors to grow and take your place – and in my opinion, you should want that – then you should make sure not to manage their time to the detail. Let them get passionate about something they had to fix, and own seeing it fixed.

Senior Engineering: Open The Door, Move Away

As part of my change of bubble this year, I officially gained the title of “Senior” Engineer. Which made me take the whole “seniority” aspect of the job with more seriousness than I did before. Not because I’m aiming at running up the ladder of seniority, but because I feel it’s part of the due diligence of my job.

I have had very good examples in front of me for most of my career — and a few not great ones, if I am to be honest. And so I’ve been trying to formulate my own take on what a senior engineer is, based on these. You may have noticed me talking about adjacent topics in my “work philosophy” tag. I also have been comparing this in my head with my contributions to Free Software, and in particular to Gentoo Linux.

I retired from Gentoo Linux a few years ago, but realistically, I stopped being actively involved in 2013, after joining the previous bubble. Part of it was a problem with contributing, part of it was a lack of time, and part of it was having the feeling that something was off. I’m starting to feel I have a better impression now of what it is, and it relates to that seniority that I’m reflecting on.

You see, I worked on Gentoo Linux a little longer than I worked at the previous bubble, and as such I could say that I became a “senior developer” by tenure, but I didn’t really gain the insight to become a “senior developer” in deeds, and this is haunting me because I feel it was a wasted opportunity, despite the fact that it taught me many of the things that I needed to be even barely successful in my current job.

It Begins Early

My best guess is that it’s because I started working on Gentoo Linux when I was fairly young, with pretty much no social experience. Which, combined with the less-than-perfect work environment of the project, had me develop a number of bad habits that took a very long time to grow out of. That is not to say that age by itself is a significant factor in this — I still resent the remark from one of the other developers that not having kids would make me a worse lead. But I do think that if I hadn’t grown up staying by myself in my own world, maybe I would have been able to do a better job.

I know people my age and younger that became very effective leaders years ago — they’ve got the charisma and the energy to get people on board, and to have them all work for a common goal in their own way. I don’t feel like I ever managed that, and I think it’s because for the longest time, the only person who I had to convince to do something was… myself.

I grew up quite lonely — in elementary school, while I can say I did have one friend, I didn’t really join the other kids. It’s a bit of a stereotype for the lonely geek, but I have been made fun of since early on for my passion for computers, and for my dislike of soccer – I feel a psychiatrist would have a field day figuring out that and the relationship with my father – and I failed at going to church and Sunday school, which was the only out-of-school mingling for most of the folks around.

Nearly thirty years later I can tell you that the individualism that I got out of this, while it gave me a few headstarts in life when it comes to technical knowledge, held me back long term on the people skills needed to herd the cats and multiply my impact. It’s not by chance that I wrote about teamwork and, without using the word, individualism.

Aside: I’m Jealous of Kids These Days

As an unrelated aside, this may be the reason why I don’t have such a negative view of social networks in general. It was something I was actually asked when I switched jobs, about what my impression of the current situation is… and my point rolls back to that: when I was growing up we didn’t have social networks, the Internet was a luxury, and while, I guess, BBSes were already a thing, they would still have been too expensive for me to access. So it took until I managed to get an Internet connection for me to discover Usenet.

I know there’s a long list of issues with all kinds of social networks: privacy, polarisation, fake news, … But at the same time I’m glad that they make it much more approachable for kids nowadays who don’t fit in with the crowd in their geographical proximity to reach out to friendlier bunches. Of course it’s a double-edged sword, as it also allows bullies to bully more effectively… but I think that’s much more of a society-at-large problem.

The Environment Matters

Whether we’re talking about FLOSS projects, or different teams at work, the environment around an individual matters. That’s because the people around them will provide influence, both positive and negative. In my case, with hindsight, I feel I hung around the wrong folks too long, in Gentoo Linux, and later on.

While a number of people I met on the project have exerted, again with hindsight, a good, positive influence in my way of approaching the world, I also can tell you now that there’s some “go-to behaviours” that go the wrong way. In particular, while I’ve always tended to be sarcastic and an iconoclast, I can tell you that in my tenure as a Gentoo Linux developer I crossed the line from “snarky” to “nasty” a lot of times.

And having learnt to avoid that, and keeping in check how close to that line I get, I also know that it is something connected to the environment around me. In my previous bubble, I once begged my director to let me change team despite having spent less than the two years I was expected to be on it. The reason? I caught myself becoming more and more snarky, getting close to that line. It wouldn’t have served either me or the company for me to stay in that environment.

Was it a problem with the team as a whole? Maybe, or maybe I just couldn’t fit into it. Or maybe it was a single individual that fouled the mood for many others. Donnie’s talk does not apply only to FLOSS projects, and The No Asshole Rule is still as relevant a book as ever in 2020. Just like in certain projects, I have seen teams in which certain areas were explicitly walked away from by the majority of the engineers, just to avoid having to deal with one or two people.

Another emergent behaviour with this is the “chosen intermediate person” — a dysfunction I have seen in multiple projects and teams, where a limited subset of team members is used to “relate” to another individual either within, or outside, the team. I have been that individual in the first year of high school, with the chemistry teacher — we complained loudly about her being a bad teacher, but now I can say that she was probably a bigger expert in her field than most of the other chemistry teachers in the school; she was just terrible with people. Since I was just as bad, it seemed like I was the best interface with her, and when the class needed her approval to go on a field trip, I was “volunteered” to be the person going.

I’ll get back later to a few more reasons why tolerating “brilliant but difficult to work with” people in a project or team is unhealthy in further ways, but I want to make a few more points here, because this can be a contentious topic due to cultural differences. I have worked with a number of engineers in the past that would be described as assholes by some, and grumpy by others.

In general, I think it’s worth giving the benefit of the doubt to people, at first — but make sure that they are aware of it! Holding people to standards they are not aware of, and have no way to course-correct around, is not fair and will stir further trouble. And while some level of civility can be assumed, in my experience projects and teams that are heavily anglophone tend to assume a lot more commonality in expectations than is fair.

Stop Having Heroes

One of the widely known shorthands at the old bubble was “no heroes” — a reference to a slide deck from one of the senior engineers in my org on the importance of not relying on “heroes” looking after a service, a job, or a process. Individuals that will step in at any time of day and night to solve an issue, and demonstrate how they are indispensable for the service to run. The talk is significantly more nuanced than my summary right now, so take my words with a grain of salt of course.

While the talk is good, I have noticed a little too often the shorthand used to just tell people to stop doing what they think is the right thing, and leave rakes all around the place. So I have some additional nuances for it of my own, starting with the fact that I find it a very bad sign when a manager uses the shorthand with their own reports — that’s because one of my managers did exactly that, and I know that it doesn’t help. Invoking the “no heroes” practice between engineers is generally fair game, and if you invoke it on your own contributions, that’s awesome, too! «This is the last time I’m fixing this, if nobody else prioritizes this, no heroes!»

On the other hand, when it’s my manager telling me to stop doing something and “let it break”, well… how does that help anyone? Yes, it’s in the best interest of the engineer (and possibly the company) for them not to be the hero that steps in, but why is this happening? Is the team relying on this heroism? Is the company relying on it? What’s the long-term plan to deal with that? Those are all questions that the manager should at least ask, rather than just tell the engineer to stop doing what they are doing!

I’ve been “the hero” a few times, both at work and in Gentoo Linux. It’s something I have always been ambivalent about. On the one hand, it feels good to be able to go and fix stuff yourself. On the other hand, it’s exhausting to feel like the one person holding up the whole fort. So yes, I totally agree that we shouldn’t have heroes holding up the fort. But since it still happens, it can’t be left just up to an individual to remember to step back at the right moment to avoid becoming a hero.

In Gentoo Linux, I feel the reason why we ended up with so many heroes was the lack of coordination between teams, and the lack of general integration — the individualism all over again. And it reminds me of a post from a former colleague about Debian, because some of the issues (very little mandated common process, too many different ways to do the same things) are the kind of “me before team” approaches that drive me up the wall, honestly.

As for my previous bubble, I think the answer I’m going to give is that the performance review process as I remember it (hopefully it changed in the meantime) should be held responsible for most of it, because of just a few words: go-to person. When looking at performance review as a checklist (which you’re told not to, but clearly a lot of people do), at least for my role, many of the levels included “being the go-to person”. Not a go-to person. Not a “subject matter expert” (which seems to be the preferred wording in my current bubble). But the go-to person.

From being the go-to person, to being the hero, to building up a cult of personality, the steps are not that far apart. And this is true in the workplace as well as in FLOSS projects — just think, and you probably can figure out a few projects that became synonymous with their maintainers, or authors.

Get Out of The Way

What I feel Gentoo Linux taught me, and in particular leaving Gentoo Linux taught me, is that the correct thing for a senior engineer to do is to know when to bow out. Or move onto a different project. Or maybe it’s not Gentoo Linux that taught me that.

But in general, I still think the most important lesson is to know how to open the door and get out of the way. And I mean it: both parts are needed. It’s not just a matter of moving on when you feel like you’ve done your part — you need to be able to also open the door (and make sure it stays open) for the others to pass through it, as well. That means planning to get out of the way, not just disappearing.

This is something that I didn’t really do well when I left Gentoo Linux. While I eventually did get out of the way, I didn’t really fully open the door. I started, and I’m proud of that, but I think I should have done this better. The blogs documenting how the Tinderbox worked, as well as the notes I left about things like the USE-based Ruby interpreter selection, seem to have been useful to have others pick up where I left off… but not in a very seamless way.

I think I did this better when I left the previous bubble, by making sure all of the stuff I was working on had breadcrumbs for the next person to pick up. I have to say it did make me warm inside to receive a tweet, months after leaving, from a colleague announcing that the long-running deprecation project I had worked on was finally completed.

It’s not an easy task. I know a number of senior engineers who can’t give up their one project — I’ve been that person before, although as I said I haven’t really considered myself a “senior” engineer before. Part of it is wanting to be able to keep the project working exactly like I want it to, and part of it is feeling attached to the project and wanting to be the person grabbing the praise for it. But I have been letting go as much as I could of these in the past few years.

Indeed, while some projects thrive under benevolent dictators for life, teams at work don’t tend to work quite as well. Those dictators become gatekeepers, and the projects can end up stagnating. Why does this happen more at work than in FLOSS? I can only venture a guess: FLOSS is a matter of personal pride — and you can “show off” having worked on someone else’s project at any time, even though it might be more interesting to “fully make the project one’s own”. On the other hand, if you’re working at a big company, you may optimise for working on projects where you can “own the impact” by the time you bring them up at performance review.

The Loadbearing Engineer

When senior engineers don’t move away after opening the door, they may become “loadbearing” — they may be the only person knowing how something works. Maybe not willingly, but someone will go “I don’t know, ask $them” whenever a question about a specific system comes by.

There’s also the risk that they may want to become loadbearing, to become irreplaceable, to build up job security. They may decide not to document the way certain processes run, the reasons why certain decisions were made, or the requirements of certain interfaces. If you happen to want to do something without involving them, they’ll be waiting for you to fail, or maybe they’ll manage to stop you from breaking an important assumption in the system at the last moment. This is clearly unhealthy for the company, or project, and risky for the person involved, if they are found to not be quite as indispensable.

There’s plenty already written on the topic of bus factor, which is what this fits into. My personal take on this is to make sure that those who become “loadbearing engineers” take at least one long vacation a year. Make sure that they are unreachable unless something goes very wrong, as in, business destroying wrong. And make sure that they don’t just mark themselves as out of office while staying glued to their work phone and computer. And yes, I’m talking about what I did to myself a couple of times over my tenure at the previous bubble.

That is, more or less, what I did by leaving Gentoo as well — I had been holding the QA fort for so long that it was a given that no matter what was wrong, Flameeyes was there to save the day. But no, eventually I wasn’t, and someone else had to go and build a better, scalable alternative.

Some of This Applies to Projects, Too

I don’t mean it as “some of the issues with engineers apply to developers”. That’s a given. I mean that some of the problems happen to apply to the projects themselves.

Projects can become the de-facto sole choice for something, leaving every improvement behind, because nobody can approach them. But if something happens and they stop being updated, that might just give the ecosystem enough of a push that they get replaced. This has happened to many FLOSS projects in the past, and it’s usually a symptom of a mostly healthy ecosystem.

We have seen how XFree86 becoming stale led to Xorg being fired up, which in turn brought us a significant number of improvements, from the splitting apart of the big monolith, to XCB, to compositors, to Wayland. Apache OpenOffice has been pretty much untouched for a long time, but that gave us LibreOffice. GCC having refused plugins for long enough put more wood behind Clang.

I know that not everybody would agree that the hardest problems in software engineering are people problems, but I honestly have that feeling at this point.

Computer-Aided Software Engineering

Fourteen years ago, fresh from translating Ian Sommerville’s Software Engineering (no, don’t buy it, I don’t find it worth it), and approaching the FLOSS community for the first time, I wrote a long article for the Italian edition of Linux Journal on Computer-Aided Software Engineering (CASE) tools. Recently, I’ve decided to post that article on the blog, since the original publisher is gone, and I thought it would be useful to just have it around. And because the OCR is not really reliable, I ended up having to retype a good chunk of it.

And that reminded me of how, despite me having been wrong a lot of times before, I still think some ideas stuck with me and I still find them valid. CASE is one of those, even though a lot of the time we’re not really talking about the tools involved as CASE.

UML is the usual example of a CASE tool — it confuses a lot of people because the “language” part suggests it’s actually used to write programs, but that’s not what it is for: it is a way to represent similar concepts in similar ways, without having to re-explain the same iconography: sequence diagrams, component diagrams, entity-relationship diagrams standardise the way you express certain relationships and information. That’s what it is all about — and while you could draw all of those diagrams without any specific tool, with either LibreOffice Draw, or Inkscape, or Visio, specific tools for UML are meant to help (aid) you with the task.

My personal preferred tool for UML is Visual Paradigm, which is a closed-source, proprietary solution — I have not found a good open source toolkit that could replace it. PlantUML is an interesting option, but it doesn’t have nearly all the aid that I would expect from an actual UML CASE tool — you can’t properly express relationships between different components across diagrams, as you don’t have a library of components and models.

But setting UML aside, there’s a lot more that should fit into the CASE definition. Tools for code validation and review, which are some of my favourite things ever, are also aids to software engineering. And so are linters, formatters, and sanitizers. It’s easy to just call them “dev tools”, but I would argue that particularly when it comes to automating the code workflows, it makes sense to consider them CASE tools, and reduce the stigma attached to the concept of CASE, particularly in the more “trendy” startups and open source — where I still feel pushback against using UML, or auto-formatters, and integrated development environments.

Indeed, most of these tools are already considered their own category: “developer productivity”. Which is not wrong, but it does reduce significantly the impact they have — it’s not just about developers, or coders. I like to say that Software Engineering is a teamwork practice, and not everybody on a Software Engineering team would be a coder — or a software engineer, even.

A proper repository of documents, kept up to date with the implementation, is not just useful for the developers that come later and need to implement features that integrate with the already existing system. It’s useful for the SRE/Ops folks who are debugging something on fire, and are looking at the interaction between different components. It’s useful to the customer support folks who are being asked why only a particular type of request is failing in one of the backends. It’s useful to the product managers to have a clear view of which use cases are implemented for the service, and which components are involved in specific user journeys.

And it extends similarly to other types of tools — a code review tool that can enforce updates to the documentation. A dependency tracking system that can match known vulnerabilities. A documentation repository that allows full reviews. An issue tracker system that can identify who most recently changed code that affects the component an issue was filed on.

And from here you can see why I’m sceptical about single-issue tools being “good enough”. Without integration, these tools are only as useful as the time they save, and often that means they are “negative useful” — it takes time to set up the tools, to remember to run them, and to address their concerns. Integrated tools instead can provide additional benefits that go beyond their immediate features.

Take a linter as an example: a good linter with low false positive rate is a great tool to make sure your code is well written. But if you have to manually run it, it’s likely that, in a collaborative project, only a few people will be running it after each change, slowing them down, while not making much of a difference for everyone else. It gets easier if the linter is integrated in the editor (or IDE), and even easier if it’s also integrated as part of code review – so those who are not using the same editor can still be advised by it – and it’s much better if it’s integrated with something like pre-commit to make it so the issues are fixed before the review is sent out.

And looking at all these pieces together, the integrations, and the user journeys, that is itself Software Engineering. FLOSS developers in general appear to have built a lot of components and tools that would allow building those integrations, but until recently I would have said that there’s been no real progress in making it proper software engineering. Nowadays, I’m happy to see that there is some progress, even as simple as EditorConfig, to avoid having to fight over which editors to support in a repository, and which ones not to.

Hopefully this type of tooling is not going to be relegated to textbooks in the future, and we’ll get used to having a bunch of CASE tools in our toolbox, to make software… better.

The Rolodex Paradigm

Silhouette of a rolodex.

Created by Marie-Pierre Bauduin from Noun Project.

In my previous bubble, I used a clipart picture of a Rolodex as my “official” avatar. This confused a lot of people, because cultures differ and, most importantly, generations differ, and it turned out that a lot of my colleagues and teammates had never seen or heard of a Rolodex. To quote one of the managers of my peer team, when my avatar was gigantic on the conference room’s monitor: «You cannot say that you don’t know what a Rolodex is, anymore!»

So, what is a Rolodex? Fundamentally, it’s a fancy address book. Think of it as a physical HyperCard. As Wikipedia points out, though, the name is sometimes used «as a metonym for the sum total of an individual’s accumulated business contacts», which is how I’m usually using it — the avatar is intentionally tongue-in-cheek. Do note that this is most definitely not the same as a Pokédex.

And what I call the Rolodex Paradigm is mainly the idea that the best way to write software is not to know everything about everything, but to know who knows something about what you need. This is easier said than done, of course, but let me try to illustrate what I mean by all of this.

One of the things I have always known about myself is that I’m mainly a generalist. I like knowing a little bit about a lot of things, rather than a lot about a few things. Which is why on this blog you’ll find superficial posts about fintech, electronics, the environment, and cloud computing. You’ll rarely find in-depth articles about anything more recently, because to get into that level of detail I would need to get myself “in the zone”, and that is hardly achievable while maintaining work and family life.

So what do I do when I need information I don’t have? I ask. And to do that, I try to keep in mind who knows something about the stuff that interests me. It’s the main reason why I used to use IRC heavily (I’m still around, but not active at all), the main reason why I got onto identi.ca, the main reason why I follow blogs and write this very blog, and the main reason why I’m on social networks including Twitter and Facebook – although I’ve switched from using my personal profile to maintaining a blog’s page – and have been fairly open with providing my email address to people, because to be able to ask, you need to make yourself available to answer.

This translates similarly to the workplace: when working at bigger companies that come with their own bubble, it’s very hard to know everything about everything, so by comparison it can be easier to build up a network of contacts who work on different areas within the company, and in particular, not just in engineering. And in a big company there’s even a different set of problems to overcome, compared to the outside, open source world.

When asking someone for help in the open source world, you need to remember that nobody is working for you (unless you’re explicitly paying them, in which case it’s less about asking for help and more about hiring help), and that while it’s possible that you’re charismatic enough (or well known enough) to pull off convincing someone to dedicate a significant amount of time to solving your problems, people are busy and they might have other priorities.

In a company setting, there’s still a bit of friction in asking someone to dedicate a significant amount of time to solving your problem rather than theirs. But if your problem is also a problem for the company, it’s much more likely that you can find someone to at least consider putting it in their prioritised list, as long as they can show something for the work done. The recognition is important not just as a way to justify the time (which itself is enough of a reason), but also because in most big companies, your promotion depends on demonstrating impact in one way or another.

Even where more formal approaches to recognition (such as Google’s Peer Bonus system) are not present, consider sending a message to the manager of whoever helped you. Highlight how they helped not just you personally, but the company — for instance, they may have dedicated one day to implement a feature in their system that saved you a week or two of work, either implementing the same feature yourself (without the expertise in the system) or working around it; or they might have agreed to join a hastily sketched one-hour meeting to provide insight into the historical business needs for a service, stopping you from making a bad decision in a project. It will go a long way.

Of course another problem is finding the people who know about the stuff you need — particularly if they are outside of your organization, and outside of your role. I’m afraid to say that it has got a lot harder nowadays, given that we’re now all working remotely from different houses, with very little to no social overlap. So this really relies significantly on two points: company culture, and manager support.

From the company’s point of view, letting employees build up their network is convenient. Which is why so many big companies provide spaces for, and foster, interactions between employees that have nothing to do with work itself. While game rooms and social activities are often sold as “perks” when recruiting, they are pretty much relaxed “water cooler” moments that build those all-too-precious networks that don’t fit into an org structure. And that’s why inclusive social events are important.

So yeah, striking up conversations with employees who are virtually strangers, talking about common interests (photography? Lego? arts?), can lead to knowing what they are working on, and once they are no longer strangers, you will feel more inclined to ask them for help later. The same goes for meeting colleagues at courses — I remember going to a negotiation training based around Stuart Diamond’s Getting More, and meeting one of the office’s administrative business partners, who is also Italian and likes chocolate. When, a few months later, I was helping to organize VDD ’14, I asked for her help to navigate the amount of paperwork required to get outsiders into the office over a weekend.

Meeting people is clearly not enough, though. Keeping in touch is also important, particularly in companies where teams and roles are fairly flexible, and people may be working on very different projects after months or years. What I used to do for this was make sure to spend time with colleagues I knew from something other than my main project when traveling. I used to travel from Dublin to London a few times a year for events — and I ended up sitting close to teams I didn’t work with directly, which led me to meet a number of colleagues I wouldn’t otherwise have interacted with at all. And later on, when I moved to London, I actually worked with some of them in my own team!

And that’s where the manager support is critical. You won’t be very successful at growing a network if your manager, for example, does not let you clear your calendar of routine meetings for the one week you’re spending in a different office. And similarly, without a manager that supports you dedicating some more time for non-business-critical training (such as the negotiation training I’ve mentioned), you’ll end up with fewer opportunities to meet random colleagues.

I think this was probably the starkest difference between my previous employer’s offices in Dublin and London: my feeling was that the latter had far fewer opportunities to meet people outside of your primary role and cultivate those connections. But it might also be caused by the fact that many more people live far enough from the office that commuting takes longer.

How is all of this going to work in a future where so many of us are remote? I honestly don’t know. For me, the lack of time spent sitting at the coffee area talking about things with colleagues I didn’t share a team with is one of the main reasons why I hope that one day, lockdown will be over. And for the rest, I’m trying to get used to talking over the Internet more.

Newcomers, Teachers, and Professionals

You may remember I already had a go at tutorials, after listening in on one that my wife had been going through. Well, she’s now learning about C after hearing me moan about higher and lower level languages, and she did that by starting with Harvard’s CS50 class, which is free to “attend” on edX. I am famously not a big fan of academia, but I didn’t think it would make my blood boil as much as it did.

I know that it’s easy to rant and moan about something that I’m not doing myself. After all you could say “Well, they are teaching at Harvard, you are just ranting on a c-list blog that is followed by less than a hundred people!” and you would be right. But at the same time, I have over a decade of experience in the industry, and my rants explicitly contrast what they say in the course with what “we” do, whether it is in open source projects, or inside a bubble.

I think the first time I found myself boiling and climbed onto my soapbox was when the teacher said that the right “design” (they keep calling it design, although I would argue it’s style) for a single-source-file program is to have the includes, followed by the declarations of all the functions, followed by main(), followed by the definitions of all the functions. Which is not something I’ve ever seen happening in my experience — because it doesn’t really make much sense: duplicating declarations/definitions in C is an unfortunate chore due to headers, but why force even more of that within the same source file?

Indeed, one of my “pre-canned comments” in reviews at my previous employer was a long form of “Define your convenience functions before calling them. I don’t want to have to jump around to see what your doodle_do() function does.” Now, it is true that in 2020 we have the technology (VSCode’s “show definition” curtain is one of the most magical tools I can think of), but if you’re anything like me, you may even sometimes print out the source code to read it, and having it flow in natural order helps.
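To make this concrete, here’s a contrived sketch of the ordering I’d rather see: the helper is defined before its first use, so there is no separate declaration to keep in sync, and the file still reads top to bottom (the names are made up).

```c
#include <stdio.h>

/* Helper defined before its first use: no forward declaration needed,
 * and a reader printing the file can follow it in order. */
static void greet(const char *name)
{
    printf("Hello, %s!\n", name);
}

int main(void)
{
    greet("world");
    return 0;
}
```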

But that was just the beginning. Some time later, as I dropped by to see how things were going, I saw a strange string type throughout the code — it turns out they have a special header, which they (later) describe as “training wheels”, that includes typedef char *string; — possibly understandable, given that it takes some time to get to arrays, pointers, and from there to character arrays, but… could it have been called something other than string, given the all-too-similarly-named std::string of C++?

Then I made the mistake of listening in on more of that lesson, and that just had me blow a fuse. The lesson takes a detour to try to explain ASCII — the fact that characters are just numbers that are looked up in a table, and that the table is typically 8-bit, with no mention of Unicode. Yes I understand Unicode is complicated and UTF-8 and other variable-length encodings will definitely give a headache to a newcomer who has not seen programming languages before. But it’s also 2020 and it might be a good idea to at least put out the idea that there’s such a thing as variable-length encoded text and that no, 8-bit characters are not enough to represent people’s names! The fact that my own name has a special character might have something to do with this, of course.

It got worse. The teacher decided to show some upper-case/lower-case trickery on strings to show how that works, and explained how you add or subtract 32 to go from one case to the other. Which is limited not only by character set, but most importantly by locale — oops, I guess the teacher never heard of the Turkish Four Is, or maybe there’s some lack of cultural diversity in the writing room for these courses. I went on a rant on Twitter over this, but let me reiterate it here because it’s important: there’s no reason why a newcomer to any programming language should know about adding/subtracting 32 to 7-bit ASCII characters to change their case, because it is not something you want to do outside of very tiny corner cases. It’s not safe in some languages. It’s not safe with characters outside the 7-bit safe Latin alphabet. It is rarely the correct thing to do. The standard library of any programming language has locale-aware functions to uppercase or lowercase a string, and that’s what you need to know!
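For the record, a minimal sketch of what I mean in C, using the standard &lt;ctype.h&gt; functions, which at least honour the current locale for single-byte encodings; proper Unicode text still needs a dedicated library such as ICU.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
    /* Pick up the user's locale instead of the default "C" locale. */
    setlocale(LC_CTYPE, "");

    char text[] = "hello, world!";
    for (char *p = text; *p != '\0'; p++) {
        /* toupper() expects an unsigned char value; the cast avoids
         * undefined behaviour on negative char values. Multi-byte UTF-8
         * characters pass through unchanged, which is exactly why real
         * text needs a Unicode-aware library instead. */
        *p = (char)toupper((unsigned char)*p);
    }
    printf("%s\n", text);
    return 0;
}
```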

Today (at the time of writing) she got to allocations, and I literally heard the teacher going for malloc(sizeof(int)*10). Even if the intention is to start with a bad example and improve on it — why on Earth they even bother teaching malloc() first, instead of calloc(), is beyond my understanding. But what do I know, it’s not like I spent a whole lot of time fixing these mistakes in real software twelve years ago. I will avoid complaining too much about the teacher suggesting that the behaviour of malloc() was decided by the clang authors.

Since there might be newcomers reading this and being a bit lost as to why I’m complaining about this — calloc() is a (mostly) safer alternative for allocating an array of elements, as it takes two parameters: the size of a single element and the number of elements that you want to allocate. Using this interface means it’s no longer possible to have an integer overflow when calculating the size, which reduces security risks. In addition, it zeroes out the memory, rather than leaving it uninitialized. While this means there is a performance cost, if you’re a newcomer to the language and just about learning it, you should err on the side of caution and use calloc() rather than malloc().
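In code, the difference is roughly this (a contrived sketch, with the error handling left out for brevity):

```c
#include <stdlib.h>

int main(void)
{
    /* The course's version: the multiplication can overflow if the count
     * ever comes from untrusted input, and the memory is uninitialized. */
    int *a = malloc(sizeof(int) * 10);

    /* calloc() performs the multiplication itself, failing cleanly on
     * overflow, and hands back zero-initialized memory. */
    int *b = calloc(10, sizeof(int));

    free(a);
    free(b);
    return 0;
}
```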

Next up there’s my facepalm on the explanation of memory layout — be prepared, because this is the same teacher who in a previous lesson said that the integer variable’s address might vary but for his explanation can be asserted to be 0x123, completely ignoring the whole concept of alignment. To explain “by value” function calls, they decide to digress again, this time explaining heap and stack, and they describe a linear memory layout, where the code of the program is followed by the globals and then the heap, with the stack at the bottom growing up. Which might have been true in the ’80s, but hasn’t been true in a long while.

Memory layout is not simple. If you want to explain a realistic memory layout you would have to cover the differences between physical and virtual memory, memory pages and page tables, hugepages, page permissions, W^X, Copy-on-Write, ASLR, … So I get that the teacher might want to simplify, skip over a number of these details, and give a simplified view of how to understand the memory layout. But as a professional in the industry for so long, I would appreciate it if they were upfront with a “By the way, this is an oversimplification, reality is very different.” Oh, and by the way, the stack grows down on x86/x86-64.
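If you’d rather poke at this than trust a diagram, a quick sketch like the following prints the addresses of a global, a heap allocation, and a stack variable; they are neither tidy little numbers like 0x123 nor laid out in a neat line, and ASLR will move them around between runs.

```c
#include <stdio.h>
#include <stdlib.h>

int a_global = 42;

int main(void)
{
    int a_local = 0;
    int *on_heap = malloc(sizeof(int));

    /* Print the addresses of a global, a heap allocation and a stack
     * variable; compare a couple of runs to see ASLR at work. */
    printf("global: %p\n", (void *)&a_global);
    printf("heap:   %p\n", (void *)on_heap);
    printf("stack:  %p\n", (void *)&a_local);

    free(on_heap);
    return 0;
}
```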

This brings me to another interesting… mess, in my opinion. The course comes with some very solid tools: a sandbox environment already primed for the course, an instance of the AWS Cloud9 IDE with the libraries already installed, a fairly recent version of clang… but then it decides to stick to this dubious old style of C, with strcpy() and strcmp() and no reference to more modern, safer options — never mind that glibc still refuses to implement the C11 Annex K safe string functions.
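Just to give one example of what I mean by safer options, here is a common idiom for a bounded copy that truncates instead of overflowing the destination (whether truncation is acceptable depends on the program, of course):

```c
#include <stdio.h>

int main(void)
{
    const char *input = "a string that may well be longer than the buffer";
    char buffer[16];

    /* Unlike strcpy(), snprintf() never writes past the end of the buffer
     * and always NUL-terminates; the return value tells you whether the
     * output was truncated. */
    int needed = snprintf(buffer, sizeof(buffer), "%s", input);
    if (needed >= (int)sizeof(buffer))
        fprintf(stderr, "input was truncated\n");

    printf("%s\n", buffer);
    return 0;
}
```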

But then they decide not only to briefly show the newcomers how to use Valgrind, of all things; they even show them how to use a custom post-processor for Valgrind’s report output, because it’s otherwise hard to read. This in a course using clang, which could rely on tools such as ASAN and MSAN to report the same information in a more concise way.
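For comparison, a leak like the one in this made-up snippet is reported directly at exit just by compiling with clang’s -fsanitize=address, with no post-processing of the output needed:

```c
/* leak.c: compile with `clang -g -fsanitize=address leak.c` and run it. */
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *buffer = malloc(64);
    strcpy(buffer, "this allocation is never freed");
    /* The missing free(buffer) is reported by LeakSanitizer (part of
     * AddressSanitizer) with a stack trace when the program exits. */
    return 0;
}
```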

I find this contrast particularly gruesome — the teacher appears to think that memory leaks are an important defect to avoid in software, so much so that they decide to hand a power tool such as Valgrind to a class of newcomers… but they don’t find Unicode, and correctness in representing people’s names (because of course they talk about names), to be as important. I find these priorities totally inappropriate in 2020.

Don’t get me wrong: I understand that writing a good programming course is hard, and that professors and teachers have a hard job in front of them when it comes to explain complex concepts to a number of people that are more eager to “make” something than to learn how it works. But I do wonder if sitting a dozen professionals through these lessons wouldn’t make for a better course overall.

«He who can, does; he who cannot, teaches» is a phrase attributed to George Bernard Shaw — I don’t really agree with it as it is, because I have met awesome professors and teachers. I already mentioned my Systems teacher, who I’m told retired just a couple of months ago. But in this case I can tell you that I wouldn’t want to have to review the code (or documentation) written by that particular teacher, as I’d have a hard time keeping to constructive comments after so many facepalms.

It’s a disservice to newcomers that this is what they are taught. And it’s professionals like me who are causing this by (clearly) not pushing back enough on academia to be more practical, or building better courseware for teachers to rely on. But again, I rant on a C-list blog; I don’t teach at Harvard.

Diagonal Contributions

This is a tale that starts at my previous dayjob. My role as an SRE had been (for the most part) one of support, with teams dedicated to developing the product, and my team making sure that it would perform reliably and without waste. The relationship with “the product team” has varied over time, depending on both the product and the SRE team’s disposition, sometimes in not particularly healthy ways either.

In one particular team, I found myself supporting (together with my team) six separate product teams, spread between Shanghai, Zurich and Mountain View. This put particular pressure on the dynamics of the team, especially since half of the members (based in Pittsburgh) didn’t even have a chance to meet the product teams of two of the services (based in Shanghai), as they would be, in the normal case, 12 hours apart. It’s in this team that I started formulating the idea I keep referring to as “diagonal contributions”.

You see, there’s often a distinction between horizontal and vertical contributions. Vertical refers to improving everything about one service, from the code itself, to its health checks, release, deployment, rollout, … While horizontal refers to improving one thing about every service, such as making every RPC-based server be monitored through the same set of metrics. There are different schools of thought on which option is valid and which one should be incentivised, so it usually depends on your manager, and their manager, which of the two approaches you’ll be rewarded for taking.

When you’re supporting so many different teams directly, vertical contributions are harder on the team overall — when you go all in to identify and fix all the issues for one of the products, you end up ignoring the work needed for the others. In these cases a horizontal approach might pay off faster, from an SRE point of view, but it comes with a cost: the product teams will then have little visibility into your work, which can turn into a nasty confrontation, particularly depending on the management you find yourself dealing with (on both sides).

It’s in that situation that I came up with “diagonal contributions”: improve a pain point shared by all the services you own, covering as many services as you can. In a similar fashion to rake collection, this is not an easy balance to strike, and it takes experience to get it right. You can imagine from the previous post that my success at working on this diagonal has varied considerably depending on teams, time, and management.

What did work for me was finding some common pain points between the six products I supported, and trying to address those not with changes to the products, but with changes to the core libraries they used or the common services they relied upon. This allowed me to show actual progress to the product teams, while solving issues that were common to most of the teams in my area, or even in the company.

It’s a similar thing with rake collection for me: say there’s a process you need to follow that takes two to three days to go through, and four out of your six teams are supposed to go through it — it’s worth investing four to six days to reduce that process to something that takes even just a couple of hours. You need fewer net people-days even just looking at the raw numbers, which is easy enough to tell, but that’s not where it stops! A process that takes more than a day adds significant risks: something can happen overnight, the person going through the process might have to take a day off, or they might have a lot of meetings the following day, adding an extra day to the total, and so on.

This is also another reason why I enjoy this kind of work — as I said before, I disagree with Randall Munroe when it comes to automation. It’s not just a matter of saving time on something trivial that you do rarely: automation is much less likely to make one-off mistakes (it’s terrifyingly good at making repeated mistakes, of course), and even if it doesn’t take less time than a human would take, it doesn’t take human time to do its work — so a three-day-long process that is completed by automation is still a better use of time than a two-day-long process that relies on a person having two consecutive days to work on it.

So building automation or tooling, or spending time making core libraries easier to use, are in my book a good way to make contributions that are more valuable than just to your immediate team, while not letting your supported teams feel like they are being ignored. But this only works if you know which pain points your supported teams have, and you can make the case that your work directly relates to those pain points — I’ve seen situations where a team had been working on very valuable automation… that relieved none of the supported team’s pain, giving them a feeling of not being taken into consideration.

In addition to a good relationship with the supported team, there’s another thing that helps. Actually, I would argue that it does more than just help, and is an absolute requirement: credibility. And management support. The former, in my experience, is a tricky one to understand (or accept) for many engineers, including me — that’s because, often enough, credibility in this space is related to the actions of your predecessors. Even when you’re supporting a new product team, it’s likely its members have had interactions with support teams (such as SRE) in the past, and those interactions will colour the initial impression of you and your team. This is even stronger when the product team was assigned a new supporting team — or you’re a new member of a team, or you’re part of the “new generation” of a team that went through a bit of churn.

The way I have attacked that problem is by building up my credibility: by listening, and by asking what problems the team feels are causing them issues. Principles of reliability and best practices are not going to help a team that is struggling to find the time to work even on basic monitoring because they are under pressure to deliver something on time. Sometimes you can take some of their load away, in a way that is sustainable for your own team, in a way that gains credibility and furthers the relationship. For instance, you may be able to spend some time writing the metric-exposing code, with the understanding that the product team will expand it as they introduce new features.

The other factor, as I said, is management — this is another of those things that might bring a feeling of unfairness. I have encountered managers who seem more concerned about immediate results than the long-term picture, and managers who appear afraid of suggesting projects that are not strictly within the scope of reliability, even when they would increase the team’s overall credibility. For this, I unfortunately don’t have a good answer. On average, I found myself lucky with the selection of managers I have reported to.

So for all of you out there in a position of supporting a product team, I hope this post helped give you ideas on how to build a more effective, healthier relationship.

On Rake Collections and Software Engineering

Autumn, earth’s back scratcher.

Matthew posted a metaphor about rakes and software engineering on Twitter – well, software development, but at this point I would argue that anyone arguing over these distinctions has nothing better to do, for good or bad – and I ran with it a bit by pointing out that in my previous bubble, I should have used “Rake Collector” as my job title.

Let me give a bit more context on this one. My understanding of Matthew’s metaphor is that senior developers (or senior software engineers, or senior systems engineers, and so on) complain that their coworkers are making mistakes (“stepping onto rakes”, also sometimes phrased as “stepping into traps”), while at the same time making their environment harder to navigate (“spreading more rakes”, also “setting up traps”).

This is not a new concept. Ex-colleague Tanya Reilly expressed a very similar idea in her “Traps and Cookies” talk.

I’m not going to repeat all of the examples of traps that Tanya has in her talk, which I thoroughly recommend for people working with computers to watch — not only developers, system administrators, or engineers. Anyone working with a computer.

Probably not even just people working with computers — Adam Savage expresses yet another similar concept in his Every Tool’s a Hammer under Sweep Up Every Day:

[…] we bought a real tree for Christmas every year […]. My job was always to put the lights on. […] I’d open the box of decorations labeled LIGHTS from the previous year and be met with an impossible tangle of twisted, knotted cords and bulbs and plugs. […] You don’t want to take the hour it’ll require to separate everything, but you know it has to be done. […]

Then one year, […] I happened to have an empty mailing tube nearby and it gave me an idea. I grabbed the end of the lights at the top of the tree, held them to the tube, then I walked around the tree over and over, turning the tube and wrapping the lights around it like a yuletide barber’s pole, until the entire six-string light snake was coiled perfectly and ready to be put back in its appointed decorations box. Then, I forgot all about it.

A year later, with the arrival of another Christmas, I pulled out all the decorations as usual, and when I opened the box of lights, I was met with the greatest surprise a tired working parent could ever wish for around the holidays: ORGANIZATION. There was my mailing tube light solution from the previous year, wrapped up neat and ready to unspool.

Adam Savage, Every Tool’s a Hammer, page 279, Sweep up every day

This is pretty much the definition of Tanya’s cookie for the future. And I have a feeling that if Adam were made aware of Tanya’s trap concept, he would probably point at a bunch of tools with similar concepts. Actually, I have a feeling I might have heard him say something about throwing out a tool that had some property opposite to everything else in the shop, making it dangerous. I might be wrong, so don’t quote me on that; I tried looking for a quote from him on this and failed to find anything. But it is something I would definitely do among my own tools.

So what about the rake collection? Well, one of the things that I’m most proud of from my seven years at that bubble is the work I’ve done trying to reduce complexity. This took many different forms, but the main one has been removing multiple optional arguments from interfaces of libraries used across the whole (language-filtered) codebase. Since I can’t give very close details of what that was about, you’ll find the example a bit contrived, but please bear with me.

When you write libraries that are used by many, many users, and you decide that you need a new feature (or that an old feature needs to be removed), you’re probably going to add a parameter to toggle the feature, and either expect the “modern” users to set it, or, if you can, do a sweep over the current users to have them explicitly request the current behaviour, and then change the default.
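In C it might look something like this contrived sketch (all the names are invented), with the flag and the legacy code path lingering long after the default should have changed:

```c
#include <stdio.h>

/* A library function that grew a flag to opt into a new, stricter code
 * path. Legacy callers pass 0 and keep the old behaviour. */
static int frobnicate(const char *input, int use_checked_parser)
{
    if (use_checked_parser) {
        /* New, stricter code path: reject empty input. */
        return input != NULL && input[0] != '\0';
    }
    /* Legacy code path, kept around "just in case". */
    return 1;
}

int main(void)
{
    /* A legacy caller nobody ever swept to flip the flag. */
    printf("legacy: %d\n", frobnicate("", 0));
    /* A "modern" caller, which the documentation says should pass 1. */
    printf("modern: %d\n", frobnicate("", 1));
    return 0;
}
```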

The problem with all of this, is that cleaning up after these parameters is often seen as not worth it. You changed the default, why would you care about the legacy users? Or you documented that all the new users should set the parameter to True, that should be enough, no?

That is a rake. And one that is left very much in the middle of the office floor by senior managers all the time. I have seen this particular pattern play out dozens, possibly hundreds of times, and not just at my previous job. The fact that the option is there to begin with is already increasing complexity on the library itself – and sometimes that complexity gets to be very expensive for the already over-stretched maintainers – but it’s also going to make life hard for the maintainers of the consumers of the library.

“Why does the documentation say this needs to be True? In the code my team uses, it’s set to False and it works fine.” “Oh, this is an optional parameter, I guess I can ignore it, since it already has a default.” *Copy-pastes from a legacy tool that is using the old code path and that nobody wanted to fix.*

As a newcomer to an environment (not just a codebase), it’s easy to step on those rakes (sometimes uttering exactly the words above), and not know it until it’s too late. For instance, if a parameter controls whether you use a more secure interface over an old one that you don’t expect new users of. When you become more acquainted with the environment, the rakes become easier and easier to spot — and my impression is that for many newcomers, that “rake detection” is the kind of magic that puts them in awe of the senior folks.

But rake collection means going a bit further. If you can detect the rake, you can pick it up, and avoid it smashing into the face of the next person who doesn’t have that detection ability. This will likely slow you down, but an environment full of rakes slows down all the newcomers, while a mostly rake-free environment is much more pleasant to work in. Unfortunately, that’s not something that aligns with business requirements, or with the incentives provided by management.

A slight aside here. Also on Twitter, I have seen threads going by about the fact that game development tends to be a time-to-market challenge, which leaves all the hacks around because that’s all you care about. I can assure you that the same is true for some non-game development too. Which is why “technical debt” feels like it’s rarely tackled (on that note, Caskey Dickson has a good technical debt talk). This is the main reason why I’m talking about environments rather than codebases. My experience is with long-lived software, and libraries that existed for twice as long as I worked at my former employer, so my main environment was codebases, but that is far from the end of it.

So how do you balance the rake-collection with the velocity of needing to get work done? I don’t have a really good answer — my balancing results have been different team by team, and they often have been related to my personal sense of achievement outside of the balancing act itself. But I can at least give an idea of what I do about this.

I described this to my former colleagues as a rule of thumb of “three times” — to keep with the rake analogy, we can call it “three notches”. When I found something that annoyed me (inconsistent documentation, required parameters that made no sense, legacy options that should never be used, and so on), I would try to remember it, rather than going out of my way to fix it. The second time, I might flag it down somehow (e.g. by adding a more explicit deprecation notice, logging a warning if the legacy codepath is executed, etc.). And the third time I would just add it to my TODO list and start addressing the problem at the source, whether it was within my remit or not.

This does not mean it’s a universal solution. It worked for me, most of the time. Sometimes I got scolded for having spent too much time on something that had little to no bearing on my team; sometimes I got celebrated for unblocking people who had been fighting with legacy features for months, if not years. I do think that it was always worth my time, though.

Unfortunately, rake-collection is rarely incentivised. The time spent cleaning up after the rakes left in the middle of the floor eats into one’s own project time, if it’s not the explicit goal of their role. And the fact that newcomers don’t step into those rakes and hurt themselves (or slow down, afraid of bumping into yet another rake) is rarely quantifiable, for managers to be made to agree to it.

What could he tell them? That twenty thousand people got bloody furious? That you could hear the arteries clanging shut all across the city? And that then they went back and took it out on their secretaries or traffic wardens or whatever, and they took it out on other people? In all kinds of vindictive little ways which, and here was the good bit, they thought up themselves. For the rest of the day. The pass-along effects were incalculable. Thousands and thousands of souls all got a faint patina of tarnish, and you hardly had to lift a finger.

But you couldn’t tell that to demons like Hastur and Ligur. Fourteenth-century minds, the lot of them. Spending years picking away at one soul. Admittedly it was craftsmanship, but you had to think differently these days. Not big, but wide. With five billion people in the world you couldn’t pick the buggers off one by one any more; you had to spread your effort. They’d never have thought up Welsh-language television, for example. Or value-added tax. Or Manchester.

Good Omens page 18.

Honestly, I often felt like Crowley: I rarely ever worked on huge, top-to-bottom cathedral projects. But I would be sweeping up a bunch of rakes, so that newcomers wouldn’t hit them, and all of my colleagues would be able to build stuff more quickly.

Why I like my bugs open

Even though I did read Donnie’s posts with his proposal on how to better Gentoo, I don’t really want to discuss them at the moment. I have to admit that Gentoo politics is far from what I want to care about right now; I have little time lately, and that little time should not be wasted discussing non-technical stuff, in my opinion, since what I’m best at is technical stuff. Yet, I wanted to at least discuss one problem I see with most non-technical ideas about how to improve Gentoo: the bug count idea.

Somehow, it seems like management-type people like to use the bug count of something as a metric to decide whether it’s good or not. I disagree with this ferociously, because it really does not say much about the software if we count as closed the bugs that have just been temporarily worked around. It would be like considering the count of lines of code as a metric to evaluate the value of software. Any half-decent software engineer knows that the number of lines of code alone is pointless, and that you should really consider the language (it changes a lot if each line can contain one or twenty instructions) and the comment-to-code ratio.

If Gentoo developers were evaluated based on how many open bugs there are, or on the average time a bug is kept open, we’d end up with a huge number of bugs closed as “need info” or “invalid” or “works for me” (which carries a vague “you suck” connotation), and of course “later”, which is one kind of resolution I really loathe. The problem is that sometimes, to reduce the impact of a bug on users, you should put in a quick workaround, like a -j1 in the emake call, or you’ll have to wait for something else to happen (for instance, you might need to use built_with_use checks until the Portage version that supports EAPI 2 is stable).

Here the problem is that the standard workflow used in Bugzilla does not suit the way Gentoo works at all. And not everybody in Gentoo follows the same policy when it comes to bugs. For instance, many people find that any problem with the upstream code does not concern Gentoo, while others feel that crashes and similar problems should be taken care of, but improvements and other requests should be sent upstream. Some people think that parallel make is a priority (I’m one of them), but others don’t care. All in all, before we decide to measure the performance of developers based on bugs, we should first come up with a strict policy on how bugs are handled.

But still, open bugs mean that we know there are problems. They might or might not mean that we’re actively working on solving them; it might well be that we’re waiting for someone else to take care of them, for instance. How should we deal with this?

I sincerely think that we should look into adding more states to the bug state machine. For instance, we could add a “worked around” state, which would be used for stuff like parallel make being disabled, --as-needed being turned off, and similar. It would mean that the bug is still there and should be resolved, but in the meantime users shouldn’t be hitting it any longer. Furthermore, we should have a way to take a bug off our radar for a while by setting it “on hold” until another bug is solved: for instance, until the new Portage goes stable, so that we can then deal with all the bugs that require EAPI 2 features.

Then we should make sure that bugs which are properly resolved, and confirmed to be so, get closed, so that we can be sure they won’t come up again. Of course it can’t be the same person who marked the bug as resolved who marks it as closed; someone else in the same team, or the user who reported the bug, should be able to confirm that the resolution is correct and that the bug is definitely closed.

But the most important thing for me is to take away the idea that bugs are a way to measure how much software sucks, and to consider them instead documentation of what has yet to be done. This is probably why trackers other than Bugzilla often change the name of the entries to “tasks” or “issues”; language has a high psychological impact, and we might want to deal with that ourselves.

So, how many tasks have you completed today?

CASE tools for software engineering

Note 2020-09-27: this article was originally published in Linux Journal Italia, in the October 2006 issue. The magazine is no longer available and doesn’t appear to be archived anywhere else, so I decided to make the article available on this blog, as I have already done for other articles I wrote in the past. The published version can be downloaded from my website, while this post includes the text. Please be patient with any typos — the original article was published over six pages, and since I no longer have a copy of the draft, it was recovered through OCR.

Software engineering, the discipline concerned with describing and improving the process of producing new software, also describes an entire class of supporting software meant to assist the production and development process itself.

We’re talking about the so-called CASE tools (Computer Aided Software Engineering).

The contribution this class of software makes to the quality of the final products is often underestimated, because in the past developing such tools was impractical or uneconomical, and using them was not considered worthwhile for a number of reasons, starting with the cost of the available software (proportional to the difficulty of developing it) and ending with the amount of mass-storage space needed to store the information.

The development of such tools was itself often too arduous: a single group of developers would never manage to create CASE software at once generic and detailed enough to adapt to the different needs of other groups of programmers, both because increasing a program’s flexibility often ends up making it too complicated to be easy to use, and because knowing how to write it requires knowing the different ways in which it will be used.

But as often happens in the software world, a few years are enough for an idea that until recently would have been considered unfeasible to become affordable, thanks to changes both in how software is developed and in the average hardware it will run on. For this reason, today there are several CASE tools that are widely used in software development processes, proprietary and free, professional and amateur alike.

The contribution of Free Software

One of the primary reasons for the expansion of CASE software is certainly tied to Free Software. By its very nature, Free Software tends to be developed by people with different personal histories, coming not only from different countries and cultures (which may interest sociologists, but certainly not developers), but above all from different schools of development and design. Most developers learn a development and design method from their teachers, on paper or in practice (a developer who learned to program in the field from an older programmer will certainly have a different way of working than a student fresh out of university who learned to program from their professors: the former will have an approach more oriented towards the functionality of the tools they develop with, while the student will have a more formal approach, given that most professors use that approach, and will thus initially judge tools by what they are told they are used for).

Because of these differences, it is often necessary to make the working method uniform within a Free Software project, and this usually happens through the use of CASE tools that help standardize the overall process.

Another contribution of Free Software to the development of CASE tools relates directly to the technical side of developing Free Software itself. As stated earlier, a limited group of developers would hardly manage to develop software that applies to very general cases: developing such software would require many months of analysis, interviews and practical trials to gather data on how the application is used, but this is usually too high a cost for a company wanting to sell its product, since the resource requirements come with no certainty of revenue. The resulting software could very easily be too generic to be used in practice, or it could end up too tied to the production processes of large companies, too complex and intricate for the small and medium developers who would form the main audience for such software. One solution is to develop a generic piece of software to be customized for each client in turn, but this could be too expensive for small developers, and those who could afford such an expense would most likely develop the supporting software in house.

Thanks to the ideals of Free Software, it is easier to start developing software cut to a specific purpose, built expressly for one group of developers, and to expect other groups of programmers to extend, abstract or refine it, reducing the need for analysis and interviews. Developing tailor-made solutions also becomes more practical, since it is no longer necessary to ask the original creators for changes: anyone can modify the software as needed, dedicating a limited number of in-house developers who already know the flow of the process in their own environment.

But even though Free Software is of primary importance in the spread of CASE tools, and indeed many of these tools are available as Free Software, it is certainly not the only reason why they are more widespread now than in the past.

Many CASE tools store information throughout all the phases of a program’s development, so the amount of mass-storage space they need grows as the software grows (and in fact software usually grows in size over its lifetime, and never shrinks). Even just five years ago, a computer with two 240 GiB hard drives was an almost impossible dream, while now it is a very common reality; even if the prices of professional hard drives for servers and corporate workstations haven’t dropped along the same curve as those for ordinary PCs, the price per MiB is still much lower than in the past, making it possible to store the large amounts of data generated by CASE tools.

Design, build, maintain

There are CASE tools to assist every phase of the software production process: some are designed and built to assist the designers who work before actual development of the software begins, others are meant to help manage the software from the moment its programming starts and throughout its lifespan, while others assist maintenance over the medium and long term.

There are also many pieces of software that are not often thought of as CASE tools, such as compilers, either because they are an integral part of the software development process, or because they were developed before the CASE definition was coined. They are mostly development “assistants”, such as compiler extensions and improvements, or debugging tools. Their place in the CASE tool category can be debated, but there is no doubt that, being part of the software development process, they are also part of the larger chain of software engineering.

Obviously there are cases where, to support the production of a particular piece of software, tools are used that were not designed and developed for that purpose. Even if they are part of certain development processes, of a company or of a group of programmers, they obviously cannot be considered CASE tools, in the same way that a web browser used to look up information cannot be considered a tool of journalistic investigation.

It should also be said that most of these tools don’t produce a final deliverable, because what they do is provide support throughout the development process. When the tools generate information directly, it is usually ephemeral information that is used by the developers and immediately made obsolete. The most important tools, however, only create a large amount of information that keeps being used by those same tools, and from which it is nevertheless possible to extract more intelligible data (though often of dubious usefulness) that can be used to create statistics and charts useful in presentations.

Support for design

Depending on the development method being followed, the initial design, before development of a piece of software begins, has greater or lesser importance; in any case, even in agile development methods and eXtreme Programming (XP), where programmers focus mostly on the incremental development and testing phases of the software, one of the first phases of the software development process is precisely design (preceded by gathering information about the software being developed).

To be able to write support tools for software design that are functional, useful and understandable by a wide audience, a language was defined that can be used to describe the structure of a program from a number of viewpoints: UML (Unified Modelling Language). UML describes how to produce a series of diagrams that in turn describe the software being designed. The most common diagrams are certainly the class diagram (which describes the inheritance relationships between classes and their members), the activity diagram (a modified version of the classic flow chart, designed to describe the flow of a program’s functions) and the use-case diagram (which lets you define the ways different people use a given piece of software).

While it is of course possible to produce UML diagrams with pencil and paper, or with a generic graphics program, a CASE tool offers more than these simple capabilities.

Consider UML design software such as Umbrello and bouml (as Free Software) or Poseidon (proprietary). These programs are not just specialized drawing programs: they provide additional support, such as the ability to define a single class once and reuse it across the various class and structure diagrams, correlating them both logically and graphically. Beyond this, almost all UML design software has a more or less elaborate function for generating code from the drawn diagrams; it is usually just a skeleton of the classes defined in the schema, with the names and signatures of the functions, without their definitions.

No software is yet sophisticated enough to generate code that works without touch-ups here and there, since UML diagrams are only descriptions of structure, usage or data flow; but, for example, the headers with the declarations of the classes derived from the corresponding diagram are a good starting point for their implementation, and a reference for the users of those classes, who can start developing other components of the software in parallel, without having to wait for their dependencies to be fully implemented.

Support for development

Almost no software is written immediately in its final form; it tends instead to evolve from a skeleton (or from nothing) until it is ready for release. Very often, too, the source code of released software does not stand still, but keeps evolving after release to create subsequent versions, to fix bugs, and so on.

For this reason, the term “development” here does not refer only to the work phase between design and validation, but to all the phases that start the moment work on the source code begins, including those needed to fix the errors found during testing and those needed to update the software to adapt to the clients’ requests throughout its supported lifetime. It is therefore of primary importance to know which changes were made over time, why, and by whom.

For this reason, one of the most important computer-aided engineering tools is certainly Source Code Management (SCM) software, that is, software that lets you save and inspect the history of a set of files containing source code (or anything else), recording all the changes that happen between one version of the files and the next. A similar result can be obtained with a non-CASE method, namely the periodic backup of the sources (daily, hourly, or at the moment the software is released), but with an SCM it is usually possible to store, alongside the various changes, a comment describing them, and, depending on how sophisticated the SCM software is, to retrieve the sources as they stood at a given point in time and start a parallel line of development from there.

The best-known and most widely used free SCM software is certainly CVS (Concurrent Versioning System), based on the older RCS (Revision Control System), which it extends with support for managing multiple files at once, access by multiple users, and access over the network. CVS has certainly been one of the key pieces in the community development of Free Software; thanks to its availability and relative ease of setup, it is one of the most widely used tools for managing the sources of Free and Open Source software, especially because over the years several services have sprung up and flourished that provide the authors of such software with web space and other project management tools, including SCM (SourceForge.net, BerliOS, GNU Savannah, …). Unfortunately CVS’s age is proving to be a big point against it, as it lacks many of the features available in other software, such as Subversion, including the handling of changes to different files as a single commit.

There is also a series of “distributed” SCM tools, which remove the need for a single server to send the changes to, and instead move them onto the various computers where the changes happen (and, if desired, onto a pseudo-centralized server).

Even though all (or almost all) SCMs allow you to view the changes between one version and another and retrieve the related information through the software itself, that information is usually retrieved and displayed through friendlier web interfaces, such as ViewVC, WebCVS, WebSVN, gitweb and so on. These tools are tightly bound to and dependent on the SCM software itself, even though they are usually developed separately and by separate groups of developers.

During development, it is also usually necessary to have documentation of the software’s internal and external interfaces available (the external ones are needed even once the software is complete), if they exist. While it is possible to write such documentation separately, to ease this task several tools have been developed that parse the sources and extract the comments found there to generate reference documentation, as long as the comments are formatted appropriately. These tools usually generate documentation in several formats, generally including HTML, but they can also generate it in a format usable by another program to provide contextual information within the source code.

Unfortunately, building a single piece of software that can be used to document any language is very difficult, not only because there is no universal parser that adapts to all the syntax rules of the various languages, but also because each language has different characteristics (call syntax, data structures, …) that must be taken into account when creating the documentation. For this reason there are several different tools that generate documentation from sources in different languages.

Among the available free tools, Doxygen is worth mentioning: it uses a syntax compatible with that of JavaDoc and with the proprietary software used to document Trolltech’s Qt libraries, is limited to languages more or less compatible with C, C++ and Java, and can generate UML class diagrams based on the code as actually written, allowing a quick comparison between the design and the actual implementation.
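To give an idea of what “appropriately formatted” comments look like, here is a minimal sketch of a Doxygen-style comment on a C function; the function itself is made up for the example.

```c
#include <stddef.h>

/**
 * @brief Computes the arithmetic mean of an array of samples.
 *
 * @param samples Pointer to the array of measured values.
 * @param count   Number of elements in the array.
 * @return The mean of the samples, or 0.0 if count is zero.
 */
double sample_mean(const double *samples, size_t count)
{
    if (count == 0)
        return 0.0;

    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += samples[i];
    return sum / count;
}
```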

Oltre a questi software di supporto strettamente CASE, ci sono altri strumenti che si possono considerare parte dei processi CASE anche se non vengono utilizzati solamente nel contesto dell’ingegneria del software, e anzi sono, per gli sviluppatori che li utilizzano, parte stessa dello sviluppo.

Si tratta di quegli strumenti utilizzati per il debug, ovvero per la ricerca e correzione degli errori di programmazione all’interno di un prodotto in via di sviluppo.

The debugging phase cannot be skipped by any developer, professional or amateur, working alone or in a group, on free or proprietary software, whatever the development method used.

It is an integral part of software development, and for this reason it is certainly the activity for which the largest number of support tools has been written.

The first important tool for this activity is the so-called "dynamic debugger", used to inspect the state of a running process in terms of its source code. This kind of tool is highly dependent on the programming language used, on the chosen compiler or interpreter, and on the operating system. While the main functions are very clear (stopping execution, inspecting variables, stepping through the code), over time the various debuggers have started to provide advanced features, such as graphical interfaces or remote access over a serial line or TCP/IP.

The most widely used free software for this task is certainly the GNU Debugger (gdb).
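
As a minimal, made-up example of the kind of defect such a tool helps with, consider the following C++ program, which crashes by dereferencing a null pointer:

    #include <iostream>

    // Counts the characters in a C string; crashes if text is a null pointer.
    int length(const char *text) {
        int count = 0;
        while (*text != '\0') {
            ++count;
            ++text;
        }
        return count;
    }

    int main() {
        const char *missing = nullptr;
        std::cout << length(missing) << std::endl;
        return 0;
    }

Compiled with debugging information (for instance with g++ -g), the program can be loaded into gdb, whose basic commands map directly onto the operations described above: break sets a breakpoint, run starts the program, next and step execute it one source line at a time, print inspects variables, and backtrace reconstructs, after the crash, the chain of calls that led to it.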

Besides this, there are two main classes of tools intended to support this activity: static analysis tools and dynamic analysis tools.

The former are used to analyze the software's sources during development, looking for wrong or problematic calls, in order to avoid common mistakes, while dynamic analysis tools are used on the running software to check its behaviour, looking for errors in memory management or in the use of the parameters it is given.

While static analysis is performed on the whole code base, dynamic analysis can only be performed on those parts of the code that are actually executed while the tool is active; so, although it is usually more sophisticated, it cannot be considered a reliable method for verifying an entire piece of software.

A caveat is in order when talking about these tools, however: a clear distinction must be made between compiled and interpreted languages, that is, between languages such as C, C++ and Java (even though the latter is partially interpreted) and scripting languages such as Perl, Python, Ruby and PHP, used professionally above all in the creation of web applications (Perl, PHP and Ruby in particular). While for the former many tools are available, for both static and dynamic analysis, for the latter things get much more complex: it is usually not possible to perform a static analysis of an interpreted script, because variables can take on values and data types that change at run time, and for dynamic analysis it is hard to separate the execution of the script from the execution of the interpreter and of the calls to the library functions it needs.

The analysis of the execution of software written in interpreted languages must therefore be carried out from within the program itself, usually without any supporting tool.

Static analysis tools have seen extensive development from the dawn of modern programming to the present day; the best known such tool is certainly LINT, developed back in the 1970s, when C compilers did not perform strict checks on the types of the parameters passed to functions, thus allowing library functions to be called with parameters different from the expected ones (integers instead of strings and vice versa, for example).

It is, however, by now an obsolete tool, since all modern compilers already perform that kind of check, along with a good part of the static checks LINT used to perform. The compiler thus also becomes an important software engineering tool which, used correctly, makes it possible to improve the quality of the software being developed.
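
A minimal sketch of the kind of mistake LINT was created to catch, and that a modern compiler now reports by itself (GCC, for example, flags it when format-string checks are enabled, as they are with -Wall), is a call to a library function with a parameter of the wrong type:

    #include <cstdio>

    int main() {
        int answer = 42;
        // %s expects a string, but an integer is passed:
        // the compiler warns about the mismatch at build time.
        std::printf("the answer is %s\n", answer);
        return 0;
    }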

While static analysis software for a given programming language often works on any operating system, apart from the usual software portability problems, dynamic analysis software also depends heavily on the operating system (and on the hardware architecture); it is therefore rather specialized software, represented in the free software world by Valgrind (for GNU/Linux) and DTrace (for Solaris and FreeBSD). Modern system libraries, too, provide some analysis facilities that can indicate whether the program is handling memory correctly.
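
As an illustration, the following made-up fragment compiles without warnings and yet contains two classic memory errors that a dynamic analysis tool such as Valgrind's memcheck reports at run time: an out-of-bounds write and a leaked allocation.

    #include <cstdlib>

    int main() {
        int *buffer = static_cast<int *>(std::malloc(10 * sizeof(int)));
        buffer[10] = 1;   // write one element past the end of the allocation
        return 0;         // the buffer is never freed: memory leak
    }

Run under valgrind, the tool reports the invalid write and the lost block and, if the program was built with debugging information, points back to the lines responsible.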

Support for testing and validation

The testing and validation phase is often underestimated both by amateur programmers and by professionals; it is the phase in which one checks that the developed software conforms to the wishes of the customer (or, more generally, of the users of the software). The use of CASE tools for this phase is even more important than for the others: it is not only an aid that simplifies the developers' work, but it can also be the only way to obtain reliable results from the tests. If some fundamental tests are always performed manually, it is neither impossible nor rare for the test to be carried out in a slightly different way each time, making the results unreliable or even useless. For this reason it is important that the checks be automated as much as possible, and to this end several CASE tools have been developed to improve their reliability. One should be aware, however, that the more complex and advanced a check is, the more likely it is to report false positives for code written in an unorthodox or otherwise unconventional way.

An example of CASE software supporting the testing phase is found in the solutions for creating so-called unit tests, such as JUnit and C++Unit, which simplify writing tests so that they can be run in a uniform way even though they exercise different parts of the software. The unit tests can then be combined into "regression test suites" which, if run regularly after important and extensive changes, make it possible to spot immediately errors introduced by the changes that might be much harder to track down later.
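
As a minimal sketch of the idea, without relying on a specific framework such as C++Unit, a unit test is just a small program that exercises one piece of the software and checks the results automatically; the function under test here is invented for the example:

    #include <cassert>

    // Hypothetical function under test.
    int clamp(int value, int low, int high) {
        if (value < low)  return low;
        if (value > high) return high;
        return value;
    }

    int main() {
        // Each assertion is one test case; the run aborts loudly if any fails.
        assert(clamp(5, 0, 10) == 5);
        assert(clamp(-3, 0, 10) == 0);
        assert(clamp(42, 0, 10) == 10);
        return 0;   // exit status 0 signals success to whatever runs the suite
    }

What a framework such as JUnit or C++Unit adds on top of this is a uniform way of declaring, grouping and reporting the individual test cases, which is what makes it practical to assemble them into the regression suites described above.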

But while regression tests are more part of the debugging activity than of the validation phase, more complex unit tests can be written that verify the behaviour of the software's external interfaces rather than individual functions or individual internal components. In that case they are a valid CASE tool for validation, even though very often the functionality to be validated requires the intervention of a real user, especially for software with a complex graphical interface.

Support for maintenance

There is another class of CASE tools that owes much of its development to free software: the so-called bug trackers, tools used both by Free Software projects and by companies developing software of any kind (but also by some companies selling services), which allow users to report errors and feature requests for a given piece of software, to comment on the requests, and to record the resolution, who resolved it, and other information related to the requests. Such software is usually a web application, so as to allow easy access by all users or customers, although there is no shortage of tools that allow requests to be sent to a central server through specific protocols.

The most widely used free software for this kind of support is certainly Bugzilla, developed by the Mozilla project (authors of the Firefox browser and the Thunderbird mail client) to manage its own bugs, but subsequently adopted by many other projects, large and small, such as KDE, GNOME, FreeDesktop, Novell, Gentoo and others. Being free software, although it is not simple to install and manage, almost every Bugzilla installation differs from the others; besides changes to the look and feel, which are the easiest part of the customization, some installations provide guided bug reporting through pages different from the original ones, which make it easier to categorize the requests.

It is also true that a great many projects do not use trackers other than those provided by their hosting service (SourceForge, BerliOS and so on), since they cannot afford a dedicated server for Bugzilla (which requires considerable resources to work properly). Because of the technical requirements needed to run Bugzilla, and its complex administration, even projects that do not use integrated hosting services look for alternative solutions for handling requests whenever possible. Another widely used tool is Mantis, lower profile and less configurable, but much lighter, having no particular requirements in terms of resources or dependencies.

Similarly, there is a tool called Trac, which combines in a single application a wiki, a web interface to the Subversion SCM software and a ticket system (equivalent to a bug tracker). Although less elaborate than Bugzilla and Mantis, Trac can easily be used to manage the bugs of medium-sized projects. It is, in any case, not merely a bug management tool but rather a so-called "integrated CASE environment", that is, software that gives access, through the same interface, to the information about requests, the documentation and the history of the sources.

Integrated environments

As seen above, there are so-called "integrated CASE environments", that is, software that gives access, through a single interface, to several CASE tools. One example is Trac, which provides a wiki, an interface to Subversion as SCM software, a ticket system and also a file release system. Similarly, the software used by SourceForge and by GForge and GNU Savannah (the first two proprietary, the third free software) provides several tools useful for software engineering (release management, bug and request management, documentation management and so on) under a single interface, enabling collaboration and coordination among developers.

But while most integrated CASE environments are web applications, so as to make collaboration between different developers easier, there are also integrated development environments (IDEs) that provide a single interface to different CASE tools, such as the tools for writing the sources (text editors), the compilers and the SCM tools.

The best-known tools for these tasks are certainly the proprietary products from Borland and Microsoft, which pioneered the development of such environments, but Eclipse from IBM and KDevelop from the KDE project, both examples of free software, are gaining a good audience.

Most integrated development environments provide support features strictly for development, and even SCM support is an advanced feature, not available in every environment and, above all, not for every SCM tool that may be in use.

What is missing in most development environments, instead, is the integration of design support tools.

It is therefore not possible to have a single integrated CASE environment, even though the development of CASE software is now much more advanced than in past years. Even the various web applications providing CASE support are not integrated with one another: although the widespread use of XML makes a generic integration of many web applications possible, nothing has been developed to allow integrating the various project management tools along the lines of SourceForge.

As far as the integration of bug trackers is concerned, although Canonical (the company funding the development of the Ubuntu distribution) has plans for a tool, building on their already developed LaunchPad, to integrate various tracking systems, nothing has yet been developed that simply allows linking bugs across different sites; Bugzilla, however, already provides the ability to export bug data as XML, which offers a starting point for using one site's data on another.

Through proper use of the CASE tools listed above during the various phases that make up the software production process, it is possible to evolve a piece of software from its origin to a final product without loss of information and without requiring the production of much paper documentation. The design, the documentation of the algorithms, the documentation of the interfaces, the history of the code and of the reported errors and requests will all be available and, if an integrated environment is used, will be full of cross-references.

It is thus possible, for example, to know the reason why a change was made, why the behaviour of the software was changed in response to its users' requests and so on; it is possible to retrieve the documentation produced during the design phase and compare it with the code that was actually written; it is possible to develop the software in parallel to tackle different problems, evolving two distinct products from a common code base.

Unfortunately, CASE support software is not yet developed enough to adapt properly to some development methods that stray too far from the general methods software engineering has laid out. For example, there is software that has no complex design phase at all and simply evolves during implementation, starting from a very simple base and listening to users' requests; in that case the important CASE support is that of SCM software and bug trackers, while design software would have no reason to be used at all.

It is therefore a field of software engineering still open to innovation, one in which every developer has the chance to bring new ideas and new implementations, thanks to the large amount of free software already available to contribute to. Software written by developers, for developers.