Computer-Aided Software Engineering

Fourteen years ago, fresh off translating Ian Sommerville’s Software Engineering (no, don’t buy it, I don’t find it worth it), and approaching the FLOSS community for the first time, I wrote a long article for the Italian edition of Linux Journal on Computer-Aided Software Engineering (CASE) tools. Recently, I decided to post that article on the blog, since the original publisher is gone, and I thought it would be useful to just have it around. And because the OCR is not really reliable, I ended up having to retype a good chunk of it.

And that reminded me how, despite having been wrong plenty of times before, some ideas have stuck with me and I still find them valid. CASE is one of those, even though a lot of the time we don’t really talk about the tools involved as CASE.

UML is the usual example of a CASE tool — it confuses a lot of people because the “language” part suggests it’s actually used to write programs, but that’s not what it is for: it is a way to represent similar concepts in similar ways, without having to re-explain the same iconography: sequence diagrams, component diagrams, and entity-relationship diagrams standardise the way you express certain relationships and information. That’s what it is all about — and while you could draw all of those diagrams without any specific tool, with LibreOffice Draw, or Inkscape, or Visio, specific tools for UML are meant to help (aid) you with the task.

My personal preferred tool for UML is Visual Paradigm, which is a closed-source, proprietary solution — I have not found a good open source toolkit that could replace it. PlantUML is an interesting option, but it doesn’t have nearly all the aid that I would expect from an actual UML CASE tool — you can’t properly express relationships between different components across diagrams, as you don’t have a library of components and models.

But setting UML aside, there’s a lot more that should fit into the CASE definition. Tools for code validation and review, which are some of my favourite things ever, are also aids to software engineering. And so are linters, formatters, and sanitizers. It’s easy to just call them “dev tools”, but I would argue that, particularly when it comes to automating code workflows, it makes sense to consider them CASE tools, and reduce the stigma attached to the concept of CASE, particularly in the more “trendy” startups and open source — where I still feel pushback against using UML, auto-formatters, and integrated development environments.

Indeed, most of these tools are already considered their own category: “developer productivity”. Which is not wrong, but it significantly understates the impact they have — it’s not just about developers, or coders. I like to say that Software Engineering is a teamwork practice, and not everybody on a Software Engineering team is a coder — or a software engineer, even.

A proper repository of documents, kept up to date with the implementation, is not just useful for the developers that come later, and need to implement features that integrate with the already existing system. It’s useful for the SRE/Ops folks who are debugging something on fire, and are looking at the interaction between different components. It’s useful to the customer support folks who are being asked why only a particular type of request is failing in one of the backends. It’s useful to the product managers to have a clear picture of which use cases are implemented for the service, and which components are involved in specific user journeys.

And the same extends to other types of tools — a code review tool that can enforce updates to the documentation. A dependency tracking system that can match known vulnerabilities. A documentation repository that allows full reviews. An issue tracker that can identify who most recently changed the code affecting the component an issue was filed on.

And from here you can see why I’m sceptical about single-issue tools being “good enough”. Without integration, these tools are only as useful as the time they save, and often that means they are “negative useful” — it takes time to set up the tools, to remember to run them, and to address their concerns. Integrated tools instead can provide additional benefits that go beyond their immediate features.

Take a linter as an example: a good linter with low false positive rate is a great tool to make sure your code is well written. But if you have to manually run it, it’s likely that, in a collaborative project, only a few people will be running it after each change, slowing them down, while not making much of a difference for everyone else. It gets easier if the linter is integrated in the editor (or IDE), and even easier if it’s also integrated as part of code review – so those who are not using the same editor can still be advised by it – and it’s much better if it’s integrated with something like pre-commit to make it so the issues are fixed before the review is sent out.
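
To make that kind of integration concrete, here is a minimal sketch of a git pre-commit hook written in Python; flake8 is assumed as the linter purely for illustration, and dedicated frameworks such as pre-commit generalise the same idea with proper configuration and per-repository hooks.

```python
#!/usr/bin/env python3
# Minimal sketch of a git pre-commit hook: run a linter (flake8 is assumed
# here, purely as an example) on the staged Python files and block the
# commit if it reports findings. Save as .git/hooks/pre-commit and make it
# executable.
import subprocess
import sys


def staged_python_files():
    """Return the Python files that are staged for the current commit."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [path for path in out.stdout.splitlines() if path.endswith(".py")]


def main():
    files = staged_python_files()
    if not files:
        return 0
    # Lint only the staged files, so the hook stays fast enough not to annoy.
    result = subprocess.run(["flake8", *files])
    if result.returncode != 0:
        print("Lint findings above; fix them (or bypass with --no-verify).")
    return result.returncode


if __name__ == "__main__":
    sys.exit(main())
```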

And looking at all these pieces together, the integrations, and the user journeys, that is itself Software Engineering. FLOSS developers in general appear to have built a lot of components and tools that would allow building those integrations, but until recently I would have said that there had been no real progress in making it proper software engineering. Nowadays, I’m happy to see that there is some progress, even as simple as EditorConfig, to avoid having to fight over which editors to support in a repository, and which ones not to.

Hopefully this type of tooling is not going to be relegated to textbooks in the future, and we’ll get used to having a bunch of CASE tools in our toolbox, to make software… better.

On filesystem wiping

I have been talking about BIOS with “facebook friends” today, and Davide (from KDE) asked me if I could send him a copy of my FreeDOS stick, or at least the instructions on how to make one, which obviously were on this very blog.

At first I wanted to just send him a compressed copy of the stick itself; then I remembered that I’m not that stupid, and I checked the content of the actual image. The problem is that even if I remove the files as they appear on the filesystem, FAT does not clear them out until the space is needed again; given that my USB stick has been used on almost every imaginable operating system (Windows XP, Vista, 7, OS X Snow Leopard and Lion, and of course Linux in a number of variants I don’t want to start thinking about), and that it rarely held even a hundred megabytes of data at a time (the stick itself is 2GB in size, I didn’t have anything smaller at the time, nor do I now), you can easily tell that the space was rarely reclaimed.

This made it a problem in two very different ways: on one side, the compression couldn’t be very good, as the deleted files were still there and would account for hundreds of megabytes at least, since I had been using the same stick for almost a year, and I had many boxes to update the BIOS of in the meantime, both mine and customers’; on the other side, some of the data in that image would include serial numbers for both Windows and Avira licenses I installed on customers’ systems (many times I only used one stick to bring data in and out of a system: that stick).

In my mind the task to perform was clear: wipe the empty areas of the filesystem so that they are all zeroed, which saves on compression and makes sure that the data is not going to be leaked. So I checked dosfstools and… no luck, there is no utility to do this. I’m not surprised; I don’t have a particular liking for dosfstools: back when it was picked up by Debian, I decided to help out and I worked for a few days on the build system and code to update and improve it… after a quick note along the lines of “next release, before we want to just push in the changes that everybody applied”, it seems to simply have been ignored. I didn’t care much about working more on that project, so I left it behind.
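
As an aside, when the stick can still be mounted read-write there is a much cruder way to get the same result: fill the free space with a file full of zeros, sync it, and delete it. It is not how the tool described next works, since that one operates on the image itself, but as a sketch it shows the goal; the mount point below is just an example.

```python
import os


def zero_free_space(mountpoint, chunk_size=1024 * 1024):
    """Fill the free space of a mounted filesystem with zeros, then remove the filler."""
    filler = os.path.join(mountpoint, "zero.fill")
    chunk = b"\0" * chunk_size
    try:
        # Unbuffered, so ENOSPC shows up on the write() call itself.
        with open(filler, "wb", buffering=0) as out:
            while True:
                try:
                    out.write(chunk)
                except OSError:
                    break  # the volume is full: the free clusters are now zeroed
            os.fsync(out.fileno())
    finally:
        os.remove(filler)


if __name__ == "__main__":
    zero_free_space("/mnt/stick")  # assumed mount point for the USB stick
```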

Googling around, I found this project on SourceForge that seemed to be designed to do exactly what I needed. If you look at the page, and you’re used to both FLOSS projects and fake projects, you’re probably wondering whether it’s legit or not. I asked myself the same, but I gave it a try in a VM, and as far as I can see, there is nothing malicious in it as it is, at least as far as vfat is concerned; I explicitly didn’t want to try other filesystems.

Aside: if you are not wondering about the legitimacy of the project I linked, and don’t understand why I would say that… well, it’s not really your average FLOSS project that boasts “100% free/clean” certifications from websites such as GearDownload and Softpedia… being FLOSS should suffice.

Unfortunately, while the project does exactly what it’s meant to do… saying that it does so sub-optimally is saying too little. There are quite a few issues with the project that make me think I’d be better off rewriting it from scratch than trying to get it fixed – and I’m not referring to the website; as much as I have an opinion about it, I don’t think it’s my place to judge that, after all my own website hasn’t really been updated in… too long a time – starting from the use of a library that, albeit interesting, doesn’t seem to even have a single release, using even its private headers!

Okay, I shouldn’t be ungrateful about this, since it did save my day, but it makes me cringe to think I can’t package it for others. And then there’s the fact that, if you were to try running the tool on a file image of a 2GB USB stick, it would probably take a day or two to complete. What’s the matter? The library it uses, ttf-lib, calls fsync() on the device – a loop device in my case – each time it writes a sector… a 512-byte sector. You can easily tell it’s not the smartest move. It was bad enough already that it used write() on 512-byte sectors one at a time, but with it also syncing the device, the tool becomes so slow it’s not funny! A couple of comment markers in the code to skip the fsync, and the tool takes just a few moments. Of course, if it actually waited for a discontinuity before writing to disk, it would probably work much better.
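
To illustrate the “wait for a discontinuity” idea, here is a sketch of how the writes could be batched: group the free sectors into contiguous runs, zero each run with a single large write(), and sync the image once at the end. Where the list of free sectors comes from (walking the FAT) is left out, so this is only an illustration of the batching, not a drop-in fix for the tool.

```python
import os

SECTOR_SIZE = 512


def contiguous_runs(sectors):
    """Yield (first_sector, count) for each contiguous run of sector numbers."""
    sectors = sorted(sectors)
    if not sectors:
        return
    start = prev = sectors[0]
    for sector in sectors[1:]:
        if sector != prev + 1:
            yield start, prev - start + 1
            start = sector
        prev = sector
    yield start, prev - start + 1


def wipe_free_sectors(image_path, free_sectors, chunk_sectors=2048):
    """Zero the given free sectors of a filesystem image, one contiguous run at a time."""
    with open(image_path, "r+b") as image:
        for first, count in contiguous_runs(free_sectors):
            image.seek(first * SECTOR_SIZE)
            while count > 0:
                step = min(count, chunk_sectors)  # bounded buffer, still few syscalls
                image.write(b"\0" * (step * SECTOR_SIZE))
                count -= step
        image.flush()
        os.fsync(image.fileno())  # one sync for the whole image, not one per sector
```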

There are a few more issues with these two projects, but at least they seem to have a decent start. I should probably contact the authors and point them in the right direction, so that they can be packaged in Gentoo and other distributions.

Oh, and as a final note: once compressed with xz -9e, the wiped image is just 384KiB, while the unwiped image is 610MiB. Yes, that’s almost the size of a CD-ROM, after compression… the small one can be downloaded here for those of you who want to use it. Sooner or later I’ll have to make a version for 1GiB sticks; I think my mother still has one or two of those I should be able to use, rather than waste a 2GiB stick that way.

Flashing a DVD Burner

Talking the other day with Caster about his problems with his DVD burner, I started wondering about that procedure; to be honest I had never thought about it, because with my current NEC burner I’ve had no problems at all.

While most modern mainboards allow updating the BIOS without any Windows or DOS system, not even a boot diskette, by using a subsystem inside the BIOS itself and reading the upgrade image from a floppy or a CD (in my case the latter, because I don’t have any floppy disk drive on Enterprise), almost every burner, as far as I can see, requires running some program under Windows.

For my own model, I found an unofficial tool for Linux x86, proprietary and closed source, designed to load modified firmware images, on a notorious site for CD crackers (which is why I won’t link it here right away); unfortunately you actually have to trust the program not to do anything evil, as the page suggests you run it as root. It also requires a binary image of the firmware to upgrade to, while NEC only provides the upgrading tool with the firmware embedded inline.

What I was thinking was to find out exactly how the cdfreaks tool works, and then reimplement it, possibly on top of libcdio, so that we’d have a FLOSS tool to upgrade NEC drives that works on any platform, not just x86. Unfortunately, the license of that tool disallows decompiling or disassembling it, although it says nothing about dynamic tracing.

Through strace I discovered that the tool uses ioctl() calls to send commands to the device to dump the firmware it has loaded; the problem is understanding those calls: there doesn’t seem to be any way to trace them in detail, and as the tool is statically linked, it’s not possible to just preload a library to hook those calls.
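
For what it’s worth, packet commands to an ATAPI/MMC drive on Linux typically go through the SCSI generic interface, and that is presumably what a libcdio-based reimplementation would end up using as well. As a starting point, here is a minimal sketch, unrelated to the proprietary tool, that sends a standard SCSI INQUIRY through the SG_IO ioctl and prints the drive’s vendor and product strings; the device node /dev/sr0 is just an example.

```python
# Minimal sketch: send a SCSI INQUIRY to an optical drive through the SG_IO
# ioctl and print the vendor/product strings. The structure mirrors
# sg_io_hdr from <scsi/sg.h>; /dev/sr0 is an assumed device node.
import ctypes
import fcntl
import os

SG_IO = 0x2285            # from <scsi/sg.h>
SG_DXFER_FROM_DEV = -3    # data flows from the device to us


class SgIoHdr(ctypes.Structure):
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]


def inquiry(device="/dev/sr0"):
    data = ctypes.create_string_buffer(96)
    sense = ctypes.create_string_buffer(32)
    # Standard SCSI INQUIRY: opcode 0x12, allocation length 96 bytes.
    cdb = (ctypes.c_ubyte * 6)(0x12, 0, 0, 0, 96, 0)

    hdr = SgIoHdr()
    hdr.interface_id = ord("S")
    hdr.dxfer_direction = SG_DXFER_FROM_DEV
    hdr.cmd_len = len(cdb)
    hdr.mx_sb_len = len(sense)
    hdr.dxfer_len = len(data)
    hdr.dxferp = ctypes.cast(data, ctypes.c_void_p)
    hdr.cmdp = ctypes.cast(cdb, ctypes.c_void_p)
    hdr.sbp = ctypes.cast(sense, ctypes.c_void_p)
    hdr.timeout = 5000  # milliseconds

    # O_NONBLOCK lets us open the drive even without a disc inserted.
    fd = os.open(device, os.O_RDONLY | os.O_NONBLOCK)
    try:
        fcntl.ioctl(fd, SG_IO, hdr)
    finally:
        os.close(fd)

    raw = data.raw
    print("Vendor: ", raw[8:16].decode("ascii", "replace").strip())
    print("Product:", raw[16:32].decode("ascii", "replace").strip())


if __name__ == "__main__":
    inquiry()
```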

There is one more problem: reverse engineering the tool is a risk for the hardware, as bricking the device is a more than likely result. So I tried the unlikely way out first, and sent an email to NEC asking for the specs of the firmware upgrade procedure. I don’t count on receiving an answer, kinda obviously.

If other users have NEC burners, and think implementing burners’ firmware upgrades with FLOSS is important, let me know; I’ll use that to gauge whether such a task is worth pursuing.

And for those wondering, I’m currently attaching labels to envelopes, which is why I’m not developing anything in particular. I have plans though, and I’ll try to follow them more closely in the next few days, and blog about them more (to allow myself to blog more easily and in my spare time, I’m currently using my E61 to email entries to myself and post them as soon as I reach one of my terminals).