The challenge of training new developers

Since I publicly explained my issues with Gentoo, I haven’t seen many changes in the air. Even though I think there is enough of a consensus among users and developers alike that staying “bleeding edge” does not have to force us to replace “unstable” with “unusable”, there is the (usual, I have to say) inertia against changing things around. While I won’t be revealing anything posted on the private gentoo-core mailing list, I would like to note that neither of the two developers I blamed, both privately and publicly, decided to respond to the criticism. The latter actually kept on doing what he was doing before: “fixing” stuff he neither uses nor tests.

And of course, QA hasn’t moved to address the problem, nor has devrel as far as I can see. Does this give me any more trust in Gentoo? No, it actually saps it away, and it is moving me day after day toward the edge of really giving the finger to the whole lot. While I haven’t decided yet what to do, it really makes me wonder whether it’s worth the time I’m pouring into this. Having the tinderbox churn away, mostly at my own expense, while somebody decides to wreak havoc throughout the tree just to get things more “bleeding” doesn’t really appeal to me that much, even with all the users urging me to keep doing my work so that it doesn’t get worse. And for those who wonder: no, I’m definitely not going to work along with Robbins, unless he paid me to do that, and a lot as well; simply because I don’t think we could ever agree on what QA means. While the Gentoo Ruby team was pouring work into Ruby 1.9, Robbins thought that the idea was to just make Ruby 1.9 the default. Great move indeed.

And for those who wonder, yep, both developer A and B are easy to spot, since I named the packages they touched explicitly. Leaving out their names was not meant as a way to take cheap shots at them, but rather as a way to avoid smearing them; not with users, who in my opinion should be warned about the developers they entrust with their systems, but rather with search engines. If I were to name the developers explicitly, it might very well be that a search for their names in Google would turn up my posts; I’m pretty sure that possible future employers might not like what I wrote too much. Since I trust they might actually be better people and developers in their place of employment, I’d rather not add extra risks for them. If you have followed me for long enough, you know I usually don’t have trouble naming names.

But before speaking more about leaving, let me try to dissect one further problem with the Gentoo ecosystem as it is now. We all know that we have a shortage of developers. Indeed, most teams are understaffed, and that’s why we’re no longer bleeding edge. For this reason, quite a few people seem to be upset at the idea of taking developers out of the pool, purely because it reduces the number of committers. While I could go on arguing about the quality-over-quantity topic (having developer B touch stuff he has no clue about is not good for the health of Gentoo), I have to accept that for some things quantity is important: we cannot restrict work on ebuilds to the people who use them if we don’t have enough people who use the stuff in the tree, which is why we have to accept at least a partial compromise on that ground.

So, why do we have this shortage of developers? We sure have no shortage of users, nor of ebuild submitters, nor of overlays to “breed” the developers. Well, there are in my opinion two main issues with the “be a developer” process. The first is that we have too few recruiters, the guys who actually make the developers; the second is that we have no real way to train those developers at all. These problems sit squarely with DevRel and QA respectively.

As far as recruiters are concerned, I’m going to state one thing that I think I have stated before, though I’m not sure I stated it publicly enough: there is just too much red tape. There, I said it. With all due respect to the late Ferris, who was a great developer for the SPARC team, his own background (being a lawyer) weighed on the DevRel team so much that it really seems to have a bureaucracy on par with most government agencies. I have no doubt that this is also why Bryan (kloeri) decided to go a different route with Exherbo. (And there I name another “taboo”; no, I’m definitely not even going to consider it. Again, it’s a social thing: while I would probably agree on a few points, some of its developers have engaged in very personal insults against me, and I would never entrust them even with a calculator.) In the particular case of recruiters, one of the requirements to become one is to go through your quizzes again. I’ll get to the quizzes in a moment, but for now let me just point out that while it might be helpful for some developers to review those quizzes (trust me, developer B broke the rules covered by at least one or two of the questions in them, repeatedly, without taking a hit), it is definitely something that bothers you quite a bit.

So back to the quizzes: they currently stop both current developers from becoming recruiters and new developers from being created. What’s the problem with these quizzes? Well, the first is that they are in DevRel’s hands. That’s a problem because DevRel’s task should be handling developers’ relations, not technical issues. The Council has given the task of handling technical issues with the tree to the QA project instead, together with the task of documenting such policies; why are the quizzes still written by DevRel then? And above all, do DevRel and QA really talk to each other about those quizzes?

The answer here is easy if I’m allowed to be blunt: QA does not have the cojones to deal with that stuff. And that includes partly me as well. The quizzes as they are now are not really that good, not because the recruiters (Petteri in particular) haven’t been working hard on them, but rather because they come out of the wrong place. Let me name a few particular problems with them:

  • they are mostly undocumented: not all the questions on the two quizzes have readily available answers in the documentation; I’ll get to the state of technical documentation later in this post, but for now let me assure you that there are a few questions that the mentors have to explain (or feed) directly to the person being recruited, as they won’t be able to find the answers by themselves;
  • they mix organisational and technical questions, code- and ebuild-related questions: “What is the devaway system?”, “What is the difference between DEPEND and RDEPEND?”, “You find a package that will not build on some architectures without PIC (-fPIC) code in the shared libraries. What is the proper way to handle this situation?”. The importance of these three questions in Gentoo varies quite a lot; it’s definitely misleading to call the quiz “ebuild quiz” when it contains this kind of thing;
  • they are not really two quizzes: by their names, “Ebuild Quiz” and “End of Mentoring Quiz”, you’d expect two vastly different quizzes that can only be answered at different times; in reality, most candidates submit the two of them at the same time, to both their mentors and the recruiters;
  • they keep on growing with unnecessary details, and at the same time they are not dropping now-pointless ones, as I’ll try to show in a moment; this is probably due to the first problem: the lack of actual documentation and guidelines calls for the quizzes to replace the documentation, but that’s definitely not going to work in the long term.

Documentation is the keyword in all of this. Lack of documentation obviously means no way to inform the new developers of what they should or shouldn’t do. The current documentation is scattered around the website, and that makes it very difficult to consult. The devmanual that Ciaran started was supposed to remove that trouble; unfortunately it has barely been touched for years (by me as well, to be honest: there is still a very old PAM documentation entry that I wrote myself and submitted to Ciaran at the time, which is nowadays vastly outdated), and it contains too much “reference” and too little explanation for the new stuff (I can say lots of bad stuff about Ciaran, but I remember he started out explaining stuff rather than just adding references). Besides, it contains quite a few objectionable suggestions that are frowned upon by the current QA team and by quite a few developers.

In its current form, by the way, the DevManual is in my opinion unmanageable. Sorry Tim, but while I’m not a huge fan of RST myself, what we have now is an abomination: it uses the basic GuideXML syntax, with a custom stylesheet, but also a lot of custom markup! The repository has over 130 files named text.xml (I don’t know about yours, but my editor of choice, Emacs, shows me the base file name when switching between them, and with all of them having the same name it is almost impossible to find the right one right away!), the diagrams are all called diagram.svg, and all the pages use “index.html” as their final name. I won’t make a mystery of the fact that I don’t like it at all; I actually started to work, with Equilibrium, an Italian user/AT, on a DocBook-based manual, porting the content, still in XML, to DocBook 5: no markup syntax extension was needed, it actually allowed a much more flexible grammar for the content, and it uses a standard syntax, not some concoction that looks similar to a Gentoo-specific XML format. We’re still going to ask contributors to learn an XML markup language; should we ask them to learn something that is not used at all outside Gentoo, or something that is being used in real life?

Now, let me try to show you the problems with a few of the questions in our quizzes, which is something I’m afraid people, especially in DevRel, will frown upon.

15. You find a package that will not build on some architectures without PIC (-fPIC) code in the shared libraries. What is the proper way to handle this situation?

This question is very deeply technical; I have written about PIC many times, so I definitely agree this is something we should pay attention to. On the other hand, I’m pretty sure that this question predates the introduction of AMD64 as an everyday architecture; I’m pretty sure it was there when I first took the quizzes about five years ago. The reason I say so is that it was indeed something very difficult to grasp for people who had only ever looked at x86 and who would very rarely have the chance to deal with the problem at hand. Position Independent Code in shared libraries was, five years ago, a topic well known only for Alpha and a few other niche architectures; then AMD64 became mainstream, and PIC is now the norm. The number of projects trying to build non-PIC shared objects also dropped to almost zero (for the new stuff) for that very same reason. By the way, this kind of problem can be solved with the help of the PIC fixing guide (which is part of the Hardened project, not of DevRel or QA).

6.e

# Use an alternative implementation instead of the default depending
# on the foo use flag.
DEPEND="foo? ( cat-foo/alternative ) : ( cat-foo/default )"

Yet another question older than me as a developer… this one resembles a common construct used in programming languages called the “ternary operator” (?: in C and similar languages). Many people have suspected that the reason for this question is simply to make sure developers who know other languages won’t be confused by that operator; in truth, that’s not the case. If you look at this devmanual page, under the last paragraph, “Legacy Inverse USE-Conditional Dependency Syntax”, you’ll see this format described as a legacy, no longer valid form. Since when has it not been valid? I cannot tell you for sure, but it was already invalid and discouraged over five years ago when I took the tests, and it is not supported by any modern implementation of ebuild-based package managers. Keeping this documented and asking questions about it makes no sense at all!
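For comparison, here is a minimal sketch of how the same intent is usually spelled today: both branches are written out, the second one with a negated USE conditional. The package names are the quiz's own placeholders, not real packages.

```shell
# Use an alternative implementation instead of the default depending
# on the foo use flag. cat-foo/* are placeholder names from the quiz.
DEPEND="foo? ( cat-foo/alternative )
	!foo? ( cat-foo/default )"
```

With the flag enabled only cat-foo/alternative is pulled in, with the flag disabled only cat-foo/default.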

8. Why are ‘head -5’ and ‘tail -5’ bad? What should be used instead?

Another question that makes sense only from a historical point of view. The use of head -5 and tail -5 has been actively discouraged for a while, as you can also tell from the presence of fixheadtails.eclass, which fixes those instances. While this is indeed correct, in the sense that POSIX specifies a different syntax for head and tail, it wasn’t much of a problem until GNU coreutils implemented a warning about the old syntax, which caused so much noise that developers explicitly went and converted all the uses of those two commands, both in ebuilds and in the scripts used by upstream. That warning went away, and while it’s still a good idea to stick to POSIX-compatible syntax, the quiz seems to stress only head and tail. Instead, the most commonly misused command is likely to be find (as the GNU syntax, the FreeBSD syntax and the POSIX syntax are quite different from one another), with sed being hardwired to GNU sed since the other implementations are much different.
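As a minimal sketch of the portability point, assuming a POSIX shell: the line count goes through -n for head and tail, and find gets an explicit starting path, which POSIX requires even though GNU find lets you omit it. The sample data, the scratch directory and the file names are made up for illustration.

```shell
# Ten numbered lines as hypothetical sample input.
sample=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# Obsolete spelling the quiz warns about: head -5 / tail -5.
# POSIX spelling: pass the line count through -n.
first_five=$(printf '%s\n' "$sample" | head -n 5)
last_five=$(printf '%s\n' "$sample" | tail -n 5)

# find: always give the starting path explicitly.
# mktemp is not part of POSIX itself, but is widely available; it is
# only used here to get a throwaway directory for the example.
scratch=$(mktemp -d)
touch "$scratch/libfoo.la" "$scratch/README"
la_files=$(find "$scratch" -type f -name '*.la')
rm -rf "$scratch"
```

The same habit applies to upstream scripts as much as to ebuilds: the spelling that works everywhere costs nothing extra.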

I picked three examples of obsolete, outdated questions, which might as well be changed or removed altogether if I asked. On the other hand, I’d like to propose an alternative approach to the whole situation. Given that the quizzes are almost always submitted together nowadays, I’d propose merging all the technical questions together and splitting out the organisational questions: you get an organisational quiz and an ebuild quiz. Bonus points if the former is written by DevRel and the latter by QA, complete with a (recruiter-private?) cheatsheet of the important things that the recruits need to note for each answer.

Even better, let’s try to fit into the whole system what a lot of other teams have been doing, including the Sunrise team: they now ask recruits to submit just the (“easier”) ebuild quiz to get their overlay commit access. But rather than having an ebuild quiz and an end of mentoring quiz like today, let’s have an ebuild quiz, dealing with the ebuild-related questions only, and an “upstream code maintenance” quiz, dealing with problems like automagic dependencies, PIC code, and so on and so forth. Wouldn’t that make more sense than the current situation?

10 thoughts on “The challenge of training new developers”

  1. It should be noted that Recruiters is a pretty much independent DevRel sub project. It would make much more sense to talk about Recruiters instead of Devrel as that more accurately describes the people involved. Anyone is free to submit improvements to the recruitment process but anytime I ask I get nothing.

  2. As I told you at FOSDEM, the first step is to get that documentation problem fixed; in my opinion nothing can be done until *that* is fixed. Maybe I didn’t stress enough the fact that all I wrote stems from the lack of an updated, consolidated devmanual.

  3. You’re probably used to a lot of “I agree with you” replies, but fwiw – here’s mine:

     I’ve been a sunrise committer for some months now, and have started to find my place in Gentoo when it comes to herds and people I socialize with. I now feel that the next natural step would be joining Gentoo so I can contribute back even more.

     I’ve witnessed many of the things you discuss on a daily basis, but documentation is by far the thing I lack most. Every time I’m about to bump or write new ebuilds, help out with bugs or whatnot – I end up looking at devmanual and get frustrated by the fact that it lacks information about what I’m looking for. My next move is usually looking at other ebuilds – but while grepping through them, I find they have different ways of solving the same issues, which makes it hard for me to know what’s “right” and “wrong”. In such cases I try to stick with the solution with the highest readability – but I’m still unsure that it is the correct one.

     When it comes to introducing new stuff and features in portage, I’m even more confused. These things usually are announced on the -dev list (which is nice for a heads up), but I have no idea where and when they end up.

  4. Johan, I couldn’t agree more with you. I proxy maintain some packages – without the help of seasoned devs who check my ebuilds (for me it’s Diego here and Scarabeus, both on the QA team), I’d often be left clueless about how to properly deal with stuff. Gentoo documentation on just about anything is quite cluttered, partially outdated and occasionally even inconsistent and misleading …

     Diego’s voice should IMHO get far more attention and real-life application in the Gentoo ecosystem.

  5. Paradoxically, complaining about lack of developers means there is nobody listening who can do anything about your complaints.

  6. Maybe officially endorsing sunrise would be useful. That would lead to a structure of “junior devs” (commit rights in sunrise) and “senior devs”.

     Also, instead of having everything in questions (or in addition to that), it might be useful to have a list of things/problems you should ask experienced developers about.

     “If you should happen to stumble upon one of these, please come to #gentoo-complex-issues and ask experienced devs for their input:
     * A
     * B
     * C”

  7. I think the quizzes are quite a stupid idea. These may be alright to test (or tease) a freshman looking to help out on his favorite distro. But a seasoned developer will often not bother to suit Gentoo bureaucracy and in consequence not be able to commit his fixes back. Look at how many working ebuilds end up in bugzilla attachments. Forever. Sunrise is a half-baked approach. Personal overlays are somewhat better (similar to Exherbo’s repositories) but Ubuntu PPAs are still much closer to a usable solution.

  8. @Arne It doesn’t seem a good solution to have a quiz ask people to attack #gentoo-dev-help. The only way to learn to deal with issues is to work with existing ebuilds that have issues and fix them, or to write an ebuild that has to deal with these issues. Sunrise does not require a quiz for commit access, but handing in the quiz does waive the pre-commit review. Sunrise’s ebuild review process, requiring a user to revise his ebuild until it meets sunrise users’ and at least one dev’s standards, is very useful for learning. Their own documentation at http://overlays.gentoo.org/… provides new ebuild writers a good shot of QA.

     @daniel I do not think that completely discarding the quizzes would help much. Even a fair amount of ebuild experience with Sunrise, etc., will not expose one to certain situations which a dev should know about. The quizzes seem to cover some of these.

  9. Ditto to Johan’s comment.

     Basically, the Gentoo documentation is likely not easily updatable or rarely maintained. Probably only the original author has write access? lol. But you have to admit, it has gotten a lot better!

     An ebuild is basically just a script. Although there are many methods of doing the same thing, I use my C coding skills as a guideline instead when writing ebuilds.
     1) Readability.
     2) Optimize where I can, while enhancing readability. I.e. define variables once at the top of the script instead of within each function. This reduces CPU cycles and enhances readability at the same time.

     Ditto with Sunrise too. Stuff should be more open, allowing more volunteers to commit vs. the extremely restrictive method currently used. Sunrise was supposed to be a solution for allowing more flexibility when submitting ebuilds, but seems more like Gentoo’s Portage.

  10. Probably the most hope for change here lies in moving the documentation to a git repository, which seems to have some impetus; it is much easier for a user to clone and push to their tree and ask for a merge than to go through all the steps necessary to get CVS commit access.
