Are tutorials to blame for basic IT problems?

Flameeyes

9 years ago

It’s now effectively impossible to spend a month following IT (and not just) new and not hear of breaches, “hacks”, or general security fiascos. Some of these are tracked down to very basic mistakes in configuration or coding of software, including the lack of hashing of passwords in database. Everyone in the industry, including me, have at some point expressed the importance of proper QA and testing, and budgeting for them in the development process. But what if the problem is much higher up the chain?

Falsehoods Programmers Believe About Names is now over seven years old, and yet my just barely complicated full name (first name with a space in it, surname with an accent) can’t be easily used by most of the services I routinely use. Ireland was particularly interesting, as most services would support characters in the “Latin extended” alphabet, due to the Irish language use of ó, but they wouldn’t accept my surname, which uses ò — this not a first, I had trouble getting invoices from French companies before because they only know about ó as a character.

In a related, but not directly connected topic, there are the problems an acquaintance of mine keeps stumbling across. They don’t want service providers to attach a title to their account, but it looks like most of the developers that implement account handling don’t actually think about this option at all, and make it hard to not set a honorific at all. In particular, it appears not only UIs tend to include a mandatory drop-down list of titles, but the database schema (or whichever other model is used to store the information) also provides the title as an enumeration within a list — that is apparent by the way my acquaintance has had their account reverted to a “default” value, likely the “zeroth” one in the enumeration.

And since most systems don’t end up using British Airways’s honorific list but are rather limited to the “usual” ones, that appears to be “Ms” more often than not, as it sorts (lexicographically) before the others. I have had that happen to me a couple of times too, as I don’t usually file the “title” field on paper forms (I never seen much of a point of it), and I guess somewhere in the pipeline a model really expects a person to have a title.

All of this has me wondering, oh-so-many times, why most systems appear to want to store a name in separate entries for first and last name (or variation thereof), and why they insist on having a honorific title that is “one of the list” rather than a freeform (which would accept the empty string as a valid value). My theory on this is that it’s the fault of the training, or of the documentation. Multiple tutorials I have read, and even followed, over the years defined a model for a “person” – whether it is an user, customer, or any other entity related to the service itself – and many of these use the most basic identifying information about a person as fields to show how the model works, which give you “name”, “surname”, and “title” fields. Bonus points to use an enumeration for the title rather than a freeform, or validation that the title is one of the “admissible” ones.

You could call this a straw man argument, but the truth is that it didn’t take me any time at all to find an example tutorial (See also Archive.is, as I hope the live version can be fixed!) that did exactly that.

Similarly, I have seen sample tutorial code explaining how to write authentication primitives that oversimplify the procedure by either ignoring the salt-and-hashing or using obviously broken hashing functions such as crypt() rather than anything solid. Given many of us know all too well how even important jobs that are not flashy enough for a “rockstar” can be pushed into the hands of junior developers or even interns, I would not be surprised if a good chunk of these weak authentication problems that are now causing us so much pain are caused by simple bad practices that are (still) taught to those who join our profession.

I am afraid I don’t have an answer of how to fix this situation. While musing, again on Twitter, the only suggestion for a good text on writing correct authentication code is the NIST recommendations, but these are, unsurprisingly, written in a tone that is not useful to teach how to do things. They are standards first and foremost, and they are good, but that makes them extremely unsuitable for newcomers to learn how to do things correctly. And while they do provide very solid ground for building formally correct implementations of common libraries to implement the authentication — I somehow doubt that most systems would care about the formal correctness of their login page, particularly given the stories we have seen up to now.

I have seen comments on social media (different people on different media) about what makes a good source of documentation changes depending on your expertise, which is quite correct. Giving a long list of things that you should or should not do is probably a bad way to introduce newcomers to development in general. But maybe we should make sure that examples, samples, and documentation are updated so that they show the current best practice rather than overly simplified, or artificially complicated (sometimes at the same time) examples.

If you’re writing documentation, or new libraries (because you’re writing documentation for new libraries you write, right?) you may want to make sure that the “minimal” example is actually the minimum you need to do, and not skip over things like error checks, or full initialisation. And please, take a look at the various “Falsehoods Programmers Believe About” lists — and see if your example implementation make those assumptions. And if so fix them, please. You’ll prevent many mistakes from happening in real world applications, simply because the next junior developer who gets hired to build a startup’s latest website will not be steered towards the wrong implementations.

Share this: