Making it easy to contribute (code)

Now that I have left one bubble, and before I join the next one, I thought it would be a good time to discuss something that would otherwise be perceived as shilling by many (I’m sure it still will be, but at least I gave it a shot): how to make it easier for employees of big corporations (case in point: Google, as I just left it) to make code contributions to your project.

I’ve heard complaints, and in some cases complained myself, about big companies not contributing back improvements, or deciding to “do their own thing” rather than contributing to an already existing project. But after spending seven years working for one of them, I have a clearer idea that there are a few things projects can do to make it easier to receive those contributions. And because I don’t want anyone to think I’m sharing secret information, I’ll be referencing the Google Open Source Docs site, which is a (nearly complete) mirror of the documentation available to Googlers.

First of all, as the landing page states, there is some content missing on that site, so that’s not secret either. I no longer have access to the internal version, so I can’t even compare it right now, but the times I have consulted it, the missing content is really not relevant to external discussion. No secret cabals hidden there.

So the first obvious thing is that you need a license. This is something that effectively everybody has been saying forever. No license in a repository means there’s no legal way to use, distribute, or contribute to the code. And pretty much every company out there will forbid its employees from contributing to such projects. Google is quite explicit on this, and includes a (strangely non-comprehensive) list of forbidden licenses; it goes into further detail with the list of banned software licenses, which includes non-open-source licenses too.

There’s a funny tidbit here: Google bans both the WTFPL and CC-0, but explicitly allows Unlicense without review (assuming a bunch of other requirements are met). This is in contrast to the Fedora Project, which goes in the other direction, recommending CC-0 over Unlicense. Personally, given the objective of this post, I would suggest the MIT license: it provides pretty much the same widely permissive terms, and is accepted by effectively everybody that I know of. It definitely covers both Google and the Fedora Project at the same time.

Update (2020-06-09): it was brought to my attention that a week or two after this post was published, Google’s Open Source documentation was amended to include Unlicense together with CC-0 and most other public domain dedications, except for US Government works and BSD0. I’ll keep my advice for using MIT for those files, if at all possible.

When it comes to licensing repositories, remember to license the repository with websites, too. Some time ago I wanted to send a correction for pdfreaders.org, but I couldn’t because the repository for the site didn’t have a license. Oops! (I believe this was fixed.)
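As a quick self-check before asking for contributions, something like the following could be run against each repository you publish, the website one included. It is only a minimal sketch: the list of file names is an assumption based on common conventions, not an official or exhaustive list, and `find_license_files` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: common conventions for naming a license file,
# not an exhaustive or official list.
LICENSE_NAMES = ("LICENSE", "LICENCE", "COPYING", "UNLICENSE")


def find_license_files(repo_root):
    """Return the top-level files whose name looks like a license file."""
    root = Path(repo_root)
    found = []
    for entry in sorted(root.iterdir()):
        # Compare the part before the first dot, case-insensitively, so that
        # "LICENSE", "LICENSE.txt" and "license.md" all match.
        stem = entry.name.upper().split(".")[0]
        if entry.is_file() and stem in LICENSE_NAMES:
            found.append(entry.name)
    return found
```

An empty result does not prove the project is unlicensed (the terms may live in the README or in per-file headers), but it is a good hint that a corporate contributor will have to stop and ask.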

At this point, with the right license, most open source projects are “patchable”: you can provide a patch to the right reviewers and they can give you a green light (or tell you why you can’t do it). This is not particularly difficult work, but it is a bit toily, and it assumes that you’re able to write a patch and get it reviewed quickly. It turns sending a patch from a ten-minute job done in one sitting into about half an hour of preparation, followed by a wait of a day or two.

For me in particular, the problem with this workflow had been transferring the patch from my Linux development workstation to my corp machine to be able to fill in the form. You can’t use GitHub private repositories, or email it to yourself either; what I used to do was use Google Drive to share it with my corporate account, but that’s also a bit painful. I have given up on some patches a few times for this reason, until…

Eventually, over the seven years I worked at Google, the policy relaxed a bit, and in particular, there is no review necessary for patching a subset of public GitHub repositories. You still need to make sure the patch is for a project with an accepted license (actually, a subset of licenses), and that the project is not one of the banned projects, or one of the projects that need SVP approval (you really don’t want to wait on an SVP, from what I have been told), but then you can patch most GitHub repositories easily.

What this means in practice is that while I was working on usbmon-tools, I could keep patching python-pcapng, but I would have had to ask for review to patch hexdump: the latter has an acceptable license (Unlicense), but it’s hosted on BitBucket rather than GitHub. The same would be true for a project hosted on GitLab.

I’m not quite clear on why there is this difference, but there you go. My recommendation here is to have at least a GitHub mirror: it doesn’t really cost much to set up mirroring, and it makes things easier for Googlers (and I’m sure other corporate employees as well), and for newcomers who might only have seen GitHub up to then.
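If your hosting service does not offer built-in mirroring, a scheduled job that runs `git push --mirror` is one way to do it. The sketch below is just an illustration under assumptions: the function names and the example URL are hypothetical, and it expects `git` to be installed and credentials for the mirror to be already configured. Keep in mind that `--mirror` pushes every ref and deletes refs on the mirror that no longer exist locally, so point it only at a repository dedicated to mirroring.

```python
import subprocess


def mirror_push_command(mirror_url):
    """Build the git command line that pushes all refs to the mirror."""
    return ["git", "push", "--mirror", mirror_url]


def push_mirror(repo_dir, mirror_url):
    """Run the mirror push from inside an existing local clone."""
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(mirror_push_command(mirror_url), cwd=repo_dir, check=True)


# Hypothetical usage, from a cron job or a post-receive hook:
# push_mirror("/srv/git/project.git", "git@github.com:example/project.git")
```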

There is one more thing that I would suggest you do. There’s an additional bit of documentation around having an AUTHORS file. I have followed this advice for most of my projects, whether released under Google’s open sourcing policy or not. It makes it much easier to make sure that everybody is credited, while not having to keep adding more and more copyright lines to the files. It also matches well with the advice given by Matija when it comes to including SPDX metadata in the files themselves.
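To make this concrete, a source file would carry a short header pointing at the AUTHORS file plus an `SPDX-License-Identifier` tag, and a small script can check that the tag is present. This is a rough sketch rather than a reference implementation: “The Example Authors” and the `spdx_identifier` helper are made up for illustration, and the parsing only handles the simple comment styles shown.

```python
import re

# A header in the style discussed above (hypothetical project name):
EXAMPLE_HEADER = """\
# SPDX-FileCopyrightText: 2020 The Example Authors
# SPDX-License-Identifier: MIT
"""

SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def spdx_identifier(text, max_lines=10):
    """Return the SPDX license expression found near the top of a file."""
    for line in text.splitlines()[:max_lines]:
        match = SPDX_RE.search(line)
        if match:
            value = match.group(1).strip()
            # Drop a trailing comment terminator such as "*/" or "-->".
            for closer in ("*/", "-->"):
                if value.endswith(closer):
                    value = value[: -len(closer)].strip()
            return value
    # No tag within the first max_lines lines.
    return None
```

Running this over every tracked file and listing the ones that return `None` is an easy way to spot files that slipped in without licensing metadata.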

With all these (small) things considered, I would say that a project is in the best state possible to accept code contributions from “corporate actors”, or at least Googlers. I think this approach, of documenting clearly (or as clearly as is feasible) how to contribute to open source projects, is a great starting point: it makes it possible for projects to avoid unwittingly putting up barriers to contributions. And at the same time it gives those projects that really don’t want corporate contributions the chance to set up as many barriers as they want. I’m sure that’s a signal some find positive; I really don’t, but that’s just me.

2 thoughts on “Making it easy to contribute (code)”

  1. I have a habit of opening issues on GitHub asking authors to add a LICENSE file. Sadly, a lot of folks do that without consulting all the previous contributors, which makes the result dubious.

    Another annoying part is people using “clever” licenses. WTFPL is at least understandable (although problematic). There are licenses like “only use this code for good” or “buy me a beer” which are legally impossible to use.

    1. The one time I was in a location, and had enough spare time, to buy Linus Torvalds a pizza and a beer, he didn’t answer his phone (nor did he turn up at the beer & pizza venue 4-5 hours later; we did tell his answerphone where and when we’d be). But, at that point, I was not acting on behalf of any corporate actor, it was just me and my mates taking a few days’ break in Helsinki.
