The challenge of training new developers

Since I publicly explained my issues with Gentoo I haven’t seen much change in the air; even though I think there is enough of a consensus among users and developers alike that staying “bleeding edge” does not have to force us to replace “unstable” with “unusable”, there is the (usual, I have to say) inertia against changing things around. While I won’t be revealing anything posted on the private gentoo-core mailing list, I would like to note that neither of the two developers I blamed, both privately and publicly, decided to respond to the criticism. And actually the latter kept on doing what he was doing before: “fixing” stuff he’s neither using nor testing.

And of course, QA hasn’t been moving to address the problem, nor has devrel as far as I can see. Does this give me any more trust in Gentoo? No, it actually saps it away, and is moving me day after day toward really giving the finger to the whole lot. While I haven’t decided yet what to do, it really makes me wonder if it’s worth the time I’m pouring into this. Having the tinderbox churn away, mostly at my own expense, while somebody decides to wreak havoc throughout the tree just to get things more “bleeding” doesn’t really appeal to me that much, even with all the users suggesting that I keep doing my work so that it doesn’t get worse. And for those who wonder, no, I’m definitely not going to work along with Robbins, unless he’d be paying me to do that, and a lot as well; simply because I don’t think we could ever agree on what QA means: while the Gentoo Ruby team was pouring work into Ruby 1.9, Robbins thought that the idea was to just make Ruby 1.9 the default. Great move indeed.

And for those who wonder, yep, both developer A and B are easy to spot, since I named the packages they touched explicitly. Leaving their names out was not meant as a cheap shot at them, but rather as a way to avoid smearing them; not with users, who in my opinion should be warned about the developers they entrust with their systems, but rather with search engines. If I were to name the developers explicitly, it might very well be that a search for their names in Google would turn up my posts; I’m pretty sure that possible future employers might not like what I wrote too much. Since I trust they might actually be better people and developers in their place of employment, I’d rather not add extra risks for them. You know I usually don’t have trouble naming names, if you have followed me for long enough.

But before speaking more about leaving, let me try to dissect one further problem with the Gentoo ecosystem as it is now. We all know that we have a shortage of developers. Indeed, most teams are understaffed, and that’s why we’re no longer bleeding edge. For this reason, quite a few people seem to be upset at the idea of getting developers out of the pool, purely because it reduces the number of committers. While I could go on arguing about the quality-over-quantity topic (having developer B touching stuff he has no clue about is not good for the health of Gentoo), I have to accept that for some things, quantity is important: we cannot restrict working on ebuilds to people who use them if we don’t have enough people who use the stuff in the tree, which is why we have to accept at least some partial compromise on that ground.

So, why do we have this shortage of developers? We sure have no shortage of users, nor of ebuild submitters, nor of overlays to “breed” the developers. Well, there are in my opinion two main issues with the “be a developer” process. The first is that we have too few recruiters, the guys who actually make the developers; the second is that we have no real way to train those developers at all. These problems sit squarely with DevRel and QA respectively.

As for recruiters, I’m going to state one thing that I think I stated before, though I’m not sure I stated it publicly enough: there is just too much red tape. There, I said it. With all due respect to the late Ferris, he was a great developer for the SPARC team, but his own background (being a lawyer) weighed on the DevRel team so much that it really seems to be a bureaucracy on par with most government agencies. I have no doubt that this is also why Bryan (kloeri) decided to go a different route with Exherbo (and here I name another “taboo”; no, I’m definitely not even going to consider it; again it’s a social thing: while I’m probably going to agree on a few points, some of its developers have engaged in very personal insults against me, and I would never entrust them even with a calculator). In the particular case of recruiters, one of the requirements to become one is to review your quizzes again; I’ll get to the quizzes in a moment, but for now let’s just point out that while it might be helpful for some developers to review those quizzes (trust me, developer B has broken the rules behind at least one or two of the questions in them, repeatedly, without taking a hit), it is definitely something that bothers you quite a bit.

So back to the quizzes: they currently keep both current developers from becoming recruiters and new developers from being created. What’s the problem with these quizzes? Well, the first is that they are in DevRel’s hands. That’s a problem because DevRel’s task should be handling developers’ relations, not technical issues. The Council has given the task of handling technical issues with the tree to the QA project instead, together with the task of documenting such policies; why are the quizzes still written by DevRel then? And especially, do DevRel and QA really talk to each other about those quizzes?

The answer here is easy if I’m allowed to be blunt: QA does not have the cojones to deal with that stuff. And that partly includes me as well. The quizzes as they are now are not really that good, not because the recruiters (Petteri in particular) haven’t been working hard on them, but rather because they come out of the wrong place. Let me name a few particular problems with them:

  • they are mostly undocumented: not all the questions on the two quizzes have readily available answers in the documentation; I’ll get to the state of technical documentation later in this post, but for now let me assure you that there are a few questions whose answers the mentors have to explain (or feed) directly to the person being recruited, as they won’t be able to find them by themselves;
  • they mix organisational and technical questions, code- and ebuild-related questions: “What is the devaway system?”, “What is the difference between DEPEND and RDEPEND?”, “You find a package that will not build on some architectures without PIC (-fPIC) code in the shared libraries. What is the proper way to handle this situation?”. The importance of these three questions in Gentoo varies quite a bit; it’s definitely misleading to call the quiz “ebuild quiz” when you have this kind of thing in it;
  • they are not really two quizzes; by their names, “Ebuild Quiz” and “End of Mentoring Quiz”, you’d expect two vastly different quizzes that can only be answered at different times; in reality, most candidates submit the two of them at the same time, to both their mentors and the recruiters;
  • they keep growing with unnecessary details, and at the same time they are not dropping now-pointless details, as I’ll try to show in a moment; this is probably due to the first problem: the lack of actual documentation and guidelines calls for the quizzes to replace the documentation, but that’s definitely not going to work in the long term.

Documentation is the keyword in all of this. Lack of documentation obviously means no way to inform the new developers of what they should or shouldn’t do. The current documentation is scattered across the website, and that makes it very difficult to consult. The devmanual that Ciaran started was supposed to remove that trouble; unfortunately it has hardly been touched for years (by me as well, to be honest: there is still a very old PAM documentation entry that I wrote myself and submitted to Ciaran at the time, which is nowadays vastly outdated), and it contains too much “reference” and too little explanation for the new stuff (I can say lots of bad stuff about Ciaran, but I remember him actually explaining stuff rather than just adding references). Besides, it contains quite a few objectionable suggestions that are frowned upon by the current QA team and by quite a few developers.

In its current form, by the way, the DevManual is in my opinion unmanageable. Sorry Tim, but while I’m not a huge fan of RST myself, what we have now is an abomination: it uses the basic GuideXML syntax, with a custom stylesheet, but also a lot of custom markup! The repository has something like 130 files named text.xml (I don’t know about yours, but my editor of choice, Emacs, shows me the base file name when switching between them, and with all of them having the same name it is almost impossible to find the right one right away!), the diagrams are all named diagram.svg, and all the pages use “index.html” as their final name. I won’t make a mystery of the fact that I don’t like it at all; I actually started to work (with Equilibrium, an Italian user/AT) on a DocBook-based manual, porting the content, still in XML, to DocBook 5: no markup syntax extension was needed, it actually allowed a much more flexible grammar for the content, and it uses a standard syntax, not some concoction that looks similar to a Gentoo-specific XML format. We’re still going to ask contributors to learn an XML markup language; should we ask them to learn something that is not used at all outside Gentoo, or something that is being used in real life?

Now, let me try to show you the problems with a few of the questions in our quizzes, which is something I’m afraid people, especially in DevRel, will frown upon.

15. You find a package that will not build on some architectures without PIC (-fPIC) code in the shared libraries. What is the proper way to handle this situation?

This question is very deeply technical; I wrote about PIC many times, so I definitely agree this is something we should pay attention to. On the other hand, I’m pretty sure that the existence of this question in the quizzes predates the introduction of AMD64 as a daily-use architecture; I’m pretty sure the question was there when I first took the quizzes about five years ago. The reason why I say so is that it was indeed something very difficult to grasp for people who only ever looked at x86 and who would very rarely have the chance to deal with the problem at hand; Position Independent Code in shared libraries was, five years ago, a topic well known only on Alpha and a few other niche architectures; then AMD64 became mainstream, and PIC is now the norm. The number of projects trying to build non-PIC shared objects also dropped to almost zero (for the new stuff) for that very same reason. By the way, the PIC fixing guide (which is part of the Hardened project, neither DevRel’s nor QA’s) helps in solving this kind of problem.
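
For those who have never met the problem, here is a rough sketch of how one might spot it in the first place. This is my own assumption of a reasonable check, not something taken from the quiz or from the PIC fixing guide: a shared object built without PIC usually carries text relocations, which readelf reports as TEXTREL in the dynamic section. The helper name has_textrel is made up for illustration.

```shell
# has_textrel is a hypothetical helper name, not part of any eclass.
# It succeeds when the dynamic section of the given ELF file mentions
# TEXTREL, which usually means non-PIC code ended up in the shared object.
has_textrel() {
	readelf -d "$1" 2>/dev/null | grep -q TEXTREL
}
```

Something like has_textrel /usr/lib/libfoo.so (with libfoo.so being just a placeholder) would then tell you whether the library is a candidate for PIC fixing.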


# Use an alternative implementation instead of the default depending
# on the foo use flag.
DEPEND="foo? ( cat-foo/alternative ) : ( cat-foo/default )"

Yet another question older than me as a developer… this one deals with a common construct used in programming languages called the “ternary operator” (?: in C and similar languages). Many people have suspected that the reason for this question is simply to make sure developers who know other languages won’t be confused by that operator; in truth, that’s not the case. If you look at this devmanual page, under the last paragraph, “Legacy Inverse USE-Conditional Dependency Syntax”, you’ll see this format described as a legacy, no longer valid form. Since when is it not valid? I cannot tell you for sure, but it was already invalid and discouraged over five years ago when I took the tests, and it is not supported by any modern implementation of ebuild-based package managers. Keeping this documented and asking questions about it makes no sense at all!
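
For comparison, the supported way to express the same dependency is to pair the conditional with its negation. A minimal sketch, reusing the placeholder package names from the question:

```shell
# Placeholder package names, taken from the quiz question.
# With USE=foo depend on the alternative; without it, on the default.
DEPEND="foo? ( cat-foo/alternative )
	!foo? ( cat-foo/default )"
```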

8. Why are ‘head -5’ and ‘tail -5’ bad? What should be used instead?

Another question that makes sense only as a matter of history. The use of head -5 and tail -5 has been actively discouraged for a while, as you can also tell from the presence of fixheadtails.eclass, which fixes those instances; while this is indeed correct, in the sense that POSIX specifies a different syntax for head and tail, it wasn’t much of a problem until GNU coreutils implemented a warning about the old syntax, which caused so much noise that developers explicitly went and converted all the uses of those two commands, both in ebuilds and in the scripts used by upstream. That warning went away, and while it’s still a good idea to stick to POSIX-compatible syntax, the quiz seems to stress only head and tail. Instead the most commonly misused command is likely to be find (as the GNU syntax, the FreeBSD syntax and the POSIX syntax are quite different from one another), with sed being hardwired to GNU sed since the other implementations differ a lot.
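
For reference, the POSIX-compatible spelling passes the line count through -n. A small sketch with made-up sample input (the variable names are just for illustration):

```shell
# Sample input: the numbers 1 to 10, one per line.
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# POSIX spelling: give the count with -n instead of "head -5" / "tail -5".
first_five=$(printf '%s\n' "$numbers" | head -n 5)
last_five=$(printf '%s\n' "$numbers" | tail -n 5)
```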

I took three examples of obsolete, outdated questions, which might well be changed or removed altogether if I asked. On the other hand, I’d like to propose an alternative approach to the whole situation. Given that the quizzes are almost always submitted together nowadays, I’d propose merging all the technical questions together, and splitting out the organisational questions: you get an organisational quiz and an ebuild quiz. Bonus points if the former is written by DevRel, and the latter by QA, complete with a (recruiter-private?) cheatsheet of the important things that the recruits need to note for each answer.

Even better, let’s try to fit into the whole system what a lot of other teams have been doing, including the Sunrise team: they are now asking recruits to submit just the (“easier”) ebuild quiz to get their overlay commit access. But rather than making it an ebuild quiz and an end of mentoring quiz like they are today, let’s make them an ebuild quiz, dealing with the ebuild-related questions only, and an “upstream code maintenance” quiz, dealing with problems like automagic dependencies, PIC code, and so on and so forth. Wouldn’t that make more sense than the current situation?