Unfriendly open source projects

While working on the Raspberry Pi setup, which I've decided is going to use wview as its main component, I've started wondering once again why some upstream projects really feel like they don't want to be friendly with distributions and other developers at all.

It’s not like they are hostile, just really unfriendly.

First of all, wview comes with its own utility library (more on that in a different post); neither project has a public repository, as they are developed behind closed doors, but you can send patches. This is bothersome but not a big deal. But when you send patches, and weeks go by without an answer and without any idea whether the patch was merged, well, things get frustrating.

But this is not the only project lately that seems to have this kind of problem. Google's PageSpeed has binary package snapshots for Debian and CentOS, but no corresponding source release. Indeed, if you want the sources, you're supposed to fetch them from the repository using Google's own repo tool, rather than any standard SCM software. While I was interested in trying the module, the idea of having to deal with repo once again didn't enthuse me, so I stopped looking into it altogether.

This, to me, looks like a common problem: who cares about the policies that distributions have had for years? Make it agile, make it easy, stop doing releases: git is good enough for everybody, and so are bzr, hg and monotone! And then the poor distribution packagers are left to suffer.

Who cares about pkg-config and the fact that it supports cross-compilation out of the box? Just rely on custom -config scripts!
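To be concrete about what gets lost: the same pkg-config binary can serve a cross build just by being pointed at the target's sysroot through a couple of environment variables, which a hand-written foo-config script with hard-coded host paths cannot do. A minimal sketch of the idea (the package name and sysroot path are made up):

```python
# Minimal sketch: asking pkg-config for the flags of a cross-compiled target
# by pointing it at that target's sysroot. "libfoo" and the sysroot path are
# made-up examples, not anything wview- or Gentoo-specific.
import os
import subprocess

def cross_pkg_config(package, sysroot):
    env = dict(
        os.environ,
        PKG_CONFIG_SYSROOT_DIR=sysroot,                    # prefix -I/-L paths with the sysroot
        PKG_CONFIG_LIBDIR=f"{sysroot}/usr/lib/pkgconfig",  # only look at the target's .pc files
    )
    cflags = subprocess.run(["pkg-config", "--cflags", package],
                            env=env, capture_output=True, text=True, check=True)
    libs = subprocess.run(["pkg-config", "--libs", package],
                          env=env, capture_output=True, text=True, check=True)
    return cflags.stdout.split(), libs.stdout.split()

print(cross_pkg_config("libfoo", "/usr/armv7a-unknown-linux-gnueabi"))
```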

Why waste time making things work once installed at the system level? They work well enough out of a repository checkout!

Sigh.

Who consumes the semantic web?

In my previous post I noted that I was adding support for the latest fad in semantic tagging of data on web pages, but it was obviously not clear who actually consumes that data. So let's see.

In the midst of the changes to Typo that I've been sending to support a fully SSL-compatible blog install (mine is not entirely there yet, mostly because most of the internal links from one post to the next are not currently protocol-relative), I've added one commit to give a bit more OpenGraph insight — OpenGraph being used almost exclusively by Facebook. The only metadata that I provide through that protocol, though, is an image for the blog (since I don't have a logo, I'm sending my Gravatar), the title of the single page, and the global site title.

Why that? Well, mostly because this way, if you post a link to my blog on Facebook, it will appear with the title of the post itself instead of the one that is visible on the page. This solves the problem of whether the title of the blog itself should be dropped from the <title> tag.
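For reference, the whole thing boils down to a handful of <meta> tags in the page head. A rough sketch of what gets emitted (the property names are standard OpenGraph; the values are placeholders, not necessarily what my Typo templates produce):

```python
# Rough sketch of the OpenGraph metadata described above. The property names
# (og:title, og:site_name, og:image) are standard OpenGraph; the values below
# are placeholders rather than the exact output of my Typo templates.
def opengraph_tags(post_title, site_title, image_url):
    properties = {
        "og:title": post_title,       # title of the single page
        "og:site_name": site_title,   # global site title
        "og:image": image_url,        # my gravatar, since the blog has no logo
    }
    return "\n".join(
        f'<meta property="{name}" content="{value}" />'
        for name, value in properties.items()
    )

print(opengraph_tags("Who consumes the semantic web?",
                     "Flameeyes's Weblog",
                     "https://www.gravatar.com/avatar/placeholder"))
```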

As far as Google is concerned, instead, the most important piece of metadata you can provide seems to be authorship tagging, which uses Google+ to connect content by the same author. Is this going to be useful? Not sure yet, but at least it shows up in a less anonymous way in the search results, and that can't be bad. Despite what they say on the linked page, it's possible to use an invisible <link> tag to connect the two, which is why you won't find a G+ logo anywhere on my blog.

What else do search engines do with the remaining semantic data? Not sure; the documentation doesn't seem to explain it, and since I don't know what happens behind the scenes, it's hard for me to give a proper answer. But I can guess, and hope, that they use it to reduce the redundancy of the current index. For instance, pages that are actually lists of posts, such as the main index, the category/tag pages and the archives, will now properly declare that they are describing blog postings whose URLs are, well, somewhere else. My hope would be for search engines to then link to the declared blog post's URL instead of the index page, and possibly boost the results for the posts that turn out to be more popular (given they can then count the comments). What I'm definitely counting on is for the descriptions in search results to be more human-centred.
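To give an idea of what that declaration looks like, here is an illustrative sketch of schema.org microdata for one entry on an index page; it is not necessarily the exact markup Typo produces, just the shape of it:

```python
# Illustrative sketch of the schema.org microdata idea: an index or archive
# page marks each entry as a BlogPosting whose canonical URL points at the
# post's own page. This is not necessarily the markup Typo actually emits.
INDEX_ENTRY_TEMPLATE = """
<div itemscope itemtype="http://schema.org/BlogPosting">
  <a itemprop="url" href="{post_url}">
    <span itemprop="name">{post_title}</span>
  </a>
</div>
"""

print(INDEX_ENTRY_TEMPLATE.format(
    post_url="https://blog.flameeyes.eu/some-post",  # the declared, canonical post URL
    post_title="Some post title",
))
```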

Now, in the case of Google, you can use their Rich Snippet testing tool, which gives you an idea of what it finds. I'm pretty sure they take all this data with a grain of salt, though, seeing how many players there are in the "SEO" world, with people trying to game the system outright. But at least I can hope that things will move in the right direction.

Interestingly, when I first implemented the new semantic data, Readability did not support it, and would show my blog's title instead of the post's title when reading the articles from there — after some feedback on their site they added a workaround for my case, so you can enjoy their app with my content just fine. Hopefully, with time, the microformat will be supported in the general sense.

On the other hand, Flattr has made no improvement in using metadata, as far as I can see. They require you to add a button manually, including repeating the kind of metadata (content type, language, tags) that could already be easily inferred from the microformats provided. So I'd like to reiterate my plea to the Flattr developers to listen to OpenGraph and other microformat data, and at least use it to augment the manually-inserted buttons. Supporting the schema.org format, by the way, should make it relatively easy to add per-fragment buttons — i.e., I wouldn't mind having a per-comment Flattr button to reward constructive comments, like they have on their own blog, but without the overhead of doing so manually.

Right now this is all the semantic data that I've figured out is actually being used. Hopefully things will become more useful in the future.

It’s that time of the year again…

Which time of the year? The time when Google announces the new Summer of Code!

Okay, so you know I'm not always very positive about the outcome of Summer of Code work, even though I'm extremely grateful to Constanze (and Mike, who got it in tree now!) for the work on filesystem-based capabilities — I'm pretty sure at this point that it has also been instrumental for the Hardened team in getting their xattr-based PaX marking (I'm tempted to reconsider Hardened for my laptops now that Skype is no longer a no-go, by the way). Other projects (many of which centred around continuous integration, with no results) ended up in much worse shape.

But since being overly negative all the time is not a good way to go through life, I'm going to propose a few possible things that could be useful to have, both for Gentoo Linux and for libav/VLC (whichever is going to be part of GSoC this year). Hopefully, if something comes out of them, it's going to be good.

First of all, a reiteration of something I've been asking of Gentoo for a while: a real alternatives-like system. Debian has a very well implemented tool for selecting among alternative packages that provide the same tool. In Gentoo we have eselect — and a bunch of modules. My laptop counts 10 different eselect packages installed, and for most of them, the overhead of having another package installed is bigger than the eselect module itself! This also does not really work that well: for instance, you cannot choose the tar command, and choosing between pkg-config and pkgconf requires installing one or the other (or disabling the flag on pkgconf, but that defeats the point, doesn't it?).
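To make the wish a bit more concrete, the core of an alternatives system is really just managed symlink indirection; a minimal sketch of the idea (this is only an illustration, not Debian's update-alternatives nor a proposed Gentoo implementation, and the paths are examples):

```python
# Minimal sketch of a symlink-based alternatives system, along the lines of
# what Debian's update-alternatives does. This is only an illustration of the
# idea; it is not any existing tool, and the paths are examples.
import os

ALT_DIR = "/etc/alternatives"

def set_alternative(name, provider, bindir="/usr/bin"):
    """Point <bindir>/<name> -> /etc/alternatives/<name> -> <provider>."""
    os.makedirs(ALT_DIR, exist_ok=True)
    managed = os.path.join(ALT_DIR, name)   # e.g. /etc/alternatives/tar
    public = os.path.join(bindir, name)     # e.g. /usr/bin/tar
    for link, target in ((managed, provider), (public, managed)):
        if os.path.lexists(link):
            os.remove(link)
        os.symlink(target, link)

# e.g. select GNU tar as the provider of the plain `tar` command
set_alternative("tar", "/usr/bin/gtar")
```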

Speaking of eselect and similar tools, we still have gcc-config and binutils-config as their own tools, without using the framework that we use for about everything else. Okay, the last person who tried this bit off more than they could chew, and the result was abysmal, but the reason is likely that the target was set too high: redo the whole compiler handling so that it could support non-GCC compilers. This might actually be too small a project for GSoC, but it might work as a qualification task, similar to the ones we've had for libav in the past.

Moving on to libav, one thing that I was discussing with Luca, J-B and other VLC developers was the possibility of integrating at least part of the DVD handling that is currently split between libdvdread and libdvdnav into libav itself. VLC already forked the two libraries (and I rewrote their build system) — Luca and I were already looking into merging them back into a single libdvd library… but a rewrite, especially one that can reuse code from libav, or introduce new code that can be shared, would probably be a good thing. I haven't looked into it, but I wouldn't be surprised if libdvbpsi could get the same treatment.

Finally, another project that could sound cool for libav would be to create a library, API- and ABI-compatible with xine, that only uses libav. I'm pretty sure that if most of xine's internals are dropped (including the configuration file and the plugin system), it would be possible to have a shallow wrapper around libav instead of a full-blown project. It might lose support for some inputs, such as module files and DVDs, but it would probably be a nice proof of concept and would show what we still need… and the moment we can deal with those formats straight in libav, we'll know we have something better than plain libav.

On a similar note, one of the last things I worked on in xine was the "audio out conversion branch", see for instance this very old post — it is neither more nor less than what we now know as libavresample, just done much worse. Indeed, libavresample actually has the SIMD-optimized routines I never figured out how to write, which makes it much nicer. Since xine, at the point when I left it, was already making quite nice use of libavutil, it might be interesting to see what happens if all the audio conversion code is killed and replaced with libavresample.

So these are my suggestions for this season of GSoC, at least for the projects I'm involved in… maybe I'll even have time to mentor them this year, as with a bit of luck I'll have stable employment when the time comes for this to happen (more on this to follow, but not yet).

A story of bad suggestions

You might have noticed that my blog was down for a little while today. The reason is that I was trying to get Google Webmaster Tools to work again, as I've been spending some more time lately cleaning up my web presence — I'll soon post more news related to Autotools Mythbuster and the direction it's going to take.

How did that take my blog down, though? Well, the new default for GWT's verification of site ownership is to use a DNS TXT record, instead of the old meta header on the homepage or the file at the site root. Unfortunately, it doesn't work as well.

First, it actually tries to be smart by checking which DNS servers are assigned to the domain — which meant it showed me instructions on how to log in to my OVH account (great). On the other hand, it told me to create the new TXT record without setting a subdomain — too bad that it will not accept a validation on flameeyes.eu for blog.flameeyes.eu.

The other problem is that the moment I added a TXT record for blog.flameeyes.eu, resolving the host no longer returned the CNAME, which meant the host was unreachable altogether. I haven't checked the DNS documentation to learn whether this is a bug in OVH or whether the GWT suggestion is completely broken; in either case it was a bad suggestion.
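For what it's worth, the symptom is easy to inspect: ask for both record types at the same name and see what the zone actually serves. A quick diagnostic sketch with dnspython (assuming dnspython 2.x is available; this only observes the problem, it doesn't fix it):

```python
# Quick diagnostic sketch (dnspython 2.x assumed): query the same name for
# both its CNAME and the newly added TXT record, to see which of the two the
# zone ends up serving after the change.
import dns.resolver

NAME = "blog.flameeyes.eu"

for rdtype in ("CNAME", "TXT"):
    try:
        answer = dns.resolver.resolve(NAME, rdtype)
        print(rdtype, [record.to_text() for record in answer])
    except dns.resolver.NoAnswer:
        print(rdtype, "no records of this type")
    except dns.resolver.NXDOMAIN:
        print(rdtype, "name does not exist at all")
```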

Also, if you happen to not be able to reach posts and you always end up on the homepage, please flush your cache: I made a mess while fixing the redirects that are meant to fix more links all over the Internet — it should all be fine now, and links should all work, even those that were previously mangled due to non-ASCII-compatible URLs.

Finally, I've updated the few posts where a YouTube video was linked; they now use the iframe-based embed strategy, which means they are visible via HTML5 without Adobe Flash. That shouldn't cause any issues.

Why I don’t trust most fully automatic CI systems

Some people complain that I should let the tinderbox work by itself and either open bugs or mail the maintainers when a failure is encountered. They say it would make it faster for the bugs to be reported, and so on. I resist the idea.

While it does take time, most of the error logs I see in the tinderbox are not reported as bugs. They can be the same bug happening over and over and over again – as with openldap lately, which fails to build with GnuTLS 3 – or they can be known or false positives, or they simply might not be due to the current package but to something else that broke, as when a KDE library fails because one of its dependencies changed ABI.

I had a bad encounter with this kind of CI system when, for a short while, I worked on the ChromiumOS project. When you commit anything, a long list of buildbots picks up the commits and validates them against their pre-defined configurations. Some of these bots are public, in the sense that you can check their configuration and see the build logs, but a number of them are private and only available on Google's network (which means you either have to be at a Google office or connect through a VPN).

I wasn't at a Google office, and I had no VPN. From time to time one of my commits would cause the buildbots to fail, and then I had to look up the failure; I think about half the time the problem wasn't a problem at all, but rather one of the bots going crazy, or breaking on another commit that wasn't flagged. The big bother was that many times the problem appeared on the private buildbots, which meant that I had no way to check the log for myself. Worse still, it would close the tree, making it impossible to commit anything other than a fix for that breakage… which, even if there was one, I couldn't write simply because I couldn't see the log.

Now, when this happened, my routine was relatively easy, but a time waster: I'd have to pop into the IRC channel (I was usually around already) and ask if somebody from the office was around; this was not always easy because I don't remember anybody in the CEST timezone having access to the private build logs at the time, at least not on IRC; most were from California, New York or Australia. Then, if the person who was around didn't know me at least by name, they'd explain to me how to reach the link to the build log… to which I had to reply that I had the link, but the hostname is not public, and then I'd have to explain that no, I didn't have access to the VPN….

I think that in the nine months I spent working on the project, my time ended up being mostly wasted trying to track people down, either to fetch the logs for me, to review my changes, or simply to ask "why did you do this at the time? Is it still needed?". Add to this the time spent waiting for the tree to come back "green" so I could push my changes (which often meant waiting for somebody in California to wake up, making half my European day useless), and the fact that I had no way to test most of the hardware-dependent code on real hardware because they wouldn't ship me anything in Europe, and you can probably see both why I didn't want to blog about it while I was at it and why I didn't continue longer than I did.

Now how does this relate to me ranting about CI today? Well, yesterday while I was working on porting as many Ruby packages as possible to the new testing recipes for RSpec and Cucumber, I found a failure in Bundler — at first I thought about just disabling the tests if not using userpriv, but then I reconsidered and wrote a patch so that the tests don’t fail, and I submitted it upstream — it’s the right thing to do, no?

Well, it turns out that Bundler uses Travis CI for testing all the pull requests and – lo and behold! – it reported my pull request as failing! "What? I checked it twice, with four Ruby implementations; it took me an hour to do that!" So I look into the failure log and I see that the reported error is an exception telling Travis that VirtualBox is being turned off. Of course the CI system doesn't come back to you to say "Sorry, I had a {male chicken}-up". So I had to comment myself to show that the pull request is not actually at fault, hoping that upstream will now accept it.

Hopefully, after relating my experiences, you can tell why the tinderbox still uses a manual filing approach, and why I prefer spending my time reviewing the logs rather than attaching them.

A story of a Registry, an advertiser, and an unused domain

This is a post that relates to one of my day jobs, and has nothing to do with Free Software, yet it is technical. If you're not interested in posts unrelated to Free Software, you may want to skip this altogether. If you still care about technical matters, read on!

Do you remember that customer of mine who almost never pays me on time, for whom I work basically free of charge, and who still gives me huge headaches from time to time with requests that make little to no sense? Okay, you probably remember by now, or you simply don't care.

Two years or so ago, that customer called me up one morning asking me to register a couple of second-level domains in as many TLDs as I thought made sense, so that they could set up a new web front-end to the business. Said project still hasn't delivered, mostly because the original estimate I sent the customer was considered unreasonably expensive and as taking "too much time" — as if they haven't spent about the same already, and my nine-month estimate sounds positively short when you compare it with the over two years of gestation the project has been lingering in. At any rate, this is of no importance to what I want to focus on here.

Since that day, one set of domains was left to expire, as it wasn't as catchy as it sounded at first, and only the second set was kept registered. I have been paid for the registrations, of course, while the domains have been left parked for the time being (no, they decided not to forward them to the business's main domain, where the address, email and phone number are).

The other day I was trying to find a way to recover a bit more money out of this customer and, incidentally, out of this blog, so I decided to register for AdSense again, this time with my VAT ID, as I have to declare any profit coming from that venue. One of the nice features of AdSense is that it allows you to "monetize" (gosh, how much I hate that word!) parked domains. Since these are by all means parked domains, I gave it a chance.

Four domains are parked this way: .net, .com, .eu and .it. All are registered with OVH – which incidentally has fixed its IPv6 troubles – and until now all pointed to a blackhole redirect. How do you assign a parked domain to Google's AdSense service? Well, it's actually easy: you just have to point the nameservers for the domain to the four provided by Google, and you're set. At least on three out of the four TLDs I had to deal with.

After setting it up on Friday, as of Monday Google still wouldn't verify the .it domain; OVH was showing the task alternately as "processing" and "completed" depending on whether I looked at the NS settings (they knew they had a request to change them) or at the task's status page (as will be apparent in a moment, it had indeed been closed). I called them — one reason I like OVH: I can get somebody on the phone to at least listen to me.

What happened? Well, it looks like Registro.it – formerly NIC-IT, the Italian registration authority – is once again quite strict in what it accepts. It was just two years ago that they stopped requiring you to fax an agreement to actually be able to register a .it domain, and as of last year you still had to do the same when transferring a domain. Luckily they stopped requiring both, and this year I was able to transfer a domain in a matter of a week or so. But what about this time?

Well, it turns out that the NIC validates the new nameservers when you want to change them, to make sure that the new servers list the domain and configure it properly. This is common procedure, and both the OVH staff and I were aware of it. What we weren't aware of (the OVH staffers had no clue about this either; they had to call NIC-IT to see what the trouble was, as they weren't properly informed either) is the method they use to do so: a dig ANY query.

Okay, it's nothing surprising actually: an ANY query is a standard way to check what records a name server serves for a domain… but it turns out that ns1.googleghs.com and its brothers – the nameservers you need to point a domain to for use with AdSense – do not answer such queries, making them invalid in the eyes of NIC-IT. Ain't that lovely? The OVH staffer I spoke with said they'll inform NIC-IT about the situation, but they don't count on them changing their ways and… I actually have to say that I can't blame them. Indeed, I don't see why Google's DNS servers should ignore ANY queries.
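If you want to see the behaviour for yourselves, it's enough to send an ANY query straight to one of those nameservers, which is effectively what NIC-IT does. A sketch with dnspython (2.x assumed; the domain is a placeholder):

```python
# Sketch of the check NIC-IT effectively performs: send an ANY query directly
# to the authoritative nameserver and look at the answer section. dnspython
# 2.x assumed; "example.it" is a placeholder for the actual domain.
import dns.message
import dns.query
import dns.rdatatype
import dns.resolver

DOMAIN = "example.it"
NAMESERVER = "ns1.googleghs.com"

server_ip = dns.resolver.resolve(NAMESERVER, "A")[0].to_text()
query = dns.message.make_query(DOMAIN, dns.rdatatype.ANY)
response = dns.query.udp(query, server_ip, timeout=5)

# an empty answer section here is what fails NIC-IT's validation
print("records in answer section:", sum(len(rrset) for rrset in response.answer))
```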

For my part, I told them that I would try to open a support request with Google to see if they intend to rectify the situation. The issue here is that, as much time as I spent trying to find one, I can't seem to find a place to open a ticket that the Google AdSense staff would actually read. I tried tweeting at their account, but that doesn't seem to have achieved much.

Luckily there is an alternative when you can’t simply set up the domain to point to Google’s DNS, and that is to create a custom zone, which is what I’ve done now. It’s not much of a problem, but it’s still bothersome that one of Google’s most prominent services is incompatible with a first-world Registration Authority such as NIC-IT.

Oh well.

Hunting for an SSL certificate

So, in the previous chapter of my current personal odyssey I noted that I was looking into SSL certificates; the last time I wrote something about it, I was looking into using CAcert to provide a few certificates. But CAcert has one nasty issue for me: not only is it not supported out of the box by any browser, but I have also so far failed to find a way to get Chromium (my browser of choice) to accept it, which makes it no better than self-signed certificates for most of my aims.

Now, back then, Owen suggested that I look into StartSSL, which is supported out of the box by most if not all of the browsers out there, and offers free Class 1 certificates. Unfortunately Class 1 certificates don't allow for SNI or wildcard certificates, which I would have liked to have, as I have a number of vhosts on this server. On the other hand, the Class 2 certificate (which does support that) has an affordable price ($50), so I wouldn't have minded confirming my personal details to get it. The problem is that to pass the validation I need to send scans of two photo IDs, and I only have one. I guess I'll finally have to get a passport.

As a positive note for them, StartSSL actually replied to my tweet-rant suggesting that I could use my birth certificate as a secondary ID for validation. I guess this is easier to procure in the United States – at least judging from the kind of reverence Americans have for birth certificates – but here I'd honestly rather not go looking for it, especially because, as it is, my birth certificate does not report my full name directly (I legally changed it a few years ago, if you remember), but only as an amendment.

There are, though, a few other problems that showed up while using StartSSL. The first is that it doesn't let you use Chrome (or Chromium) to handle registration, because of troubles with client-side certificates. Another problem is that the verification of domain access is not based on DNS, but just on email addresses: you verify the domain foo by receiving an email directed to webmaster@foo (or a few other addresses, both standard ones and those taken from the domain's whois record). While it's relatively secure, it only works if the domain can receive email, and it only seems to work for verifying second-level domains.

Using the kind of verification that Google uses for domains would make verifying domain ownership much nicer, and it works with subdomains as well as with domains that lack email entirely. For those who don't know how Google's domain verification works: they give you the name of a CNAME record to add to your domain, pointing it to "google.com"; since the CNAME they tell you to set up is derived from a hash of your account name and the domain itself, they can ensure that you have access to the domain configuration and thus to the domain itself. I guess the problem here is just that DNS takes much longer to propagate than an email takes to arrive, and having a fast way to create a new certificate is definitely one of StartSSL's good points.
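Just to illustrate the general idea (this is emphatically not Google's actual scheme; the hashing and record naming below are invented for the example), an ownership check of this kind boils down to asking the owner to publish a record only they could derive, tied to the account and the domain:

```python
# Illustration only: NOT Google's actual verification scheme. The hash
# construction and the record naming are invented to show the general idea
# of hash-derived CNAME ownership checks.
import hashlib

def verification_cname(account, domain):
    token = hashlib.sha1(f"{account}:{domain}".encode()).hexdigest()[:16]
    # the site owner would then be asked to create, in their zone:
    #   <token>.<domain>.  IN  CNAME  google.com.
    return f"{token}.{domain}"

print(verification_cname("someone@example.com", "example.org"))
```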

At any rate, I got a couple of certificates this way, so I finally don’t get Chrome’s warnings because of invalid certificates when I access this computer’s Transmission web interface (which I secure through an Apache reverse proxy). And I also took the time to finally secure xine’s Bugzilla with an SSL connection and certificate.

Thanks Owen, thanks StartSSL!

So, wasn’t HTML5 supposed to make me Flash-free?

Just like Multimedia Mike, I have been quite sceptical about seeing HTML5 as a saviour of the open web. Not only because I dislike Ogg with a passion after having tried to parse it myself without the help of libogg (don't get me started), but because I can pragmatically expect a huge number of problems related to serving multiple variants of a video file depending on browser and operating system. Lacking common ground, it's generally a bad situation.

But I had been hoping that Google's commitment to supporting HTML5 video, especially on YouTube, would give me a mostly Flash-free environment; unfortunately that doesn't seem to be the case. There is a post on the YouTube API blog from last month that tries to explain to users why they are still required to use Flash. It has, though, the sour taste that reminds me of Microsoft's boasting about Windows Genuine Advantage. I guess that a note such as this:

Without content protection, we would not be able to offer videos like this.

which lands me on a page saying, right at the top, "This rental is currently unavailable in your country." – without any further notice, and without any warning that Your Mileage May Vary – makes it very easy to come away from the post with mixed feelings.

Now, from that same post, I get the feeling that for now Google is not planning on supporting embedded YouTube videos via HTML5, and relies entirely on Flash for that:

Flash Player’s ability to combine application code and resources into a secure, efficient package has been instrumental in allowing YouTube videos to be embedded in other web sites. Web site owners need to ensure that embedded content is not able to access private user information on the containing page, and we need to ensure that our video player logic travels with the video (for features like captions, annotations, and advertising). While HTML5 adds sandboxing and message-passing functionality, Flash is the only mechanism most web sites allow for embedded content from other sites.

Very unfortunate, given that a number of websites, including that of a friend of mine, actually use YouTube to embed videos; even my blog has a post using it. It's still a shame, because it means Google loses the iPad users… or does it, at all? I played around for a minute with an iPad at the local Mediaworld (Mediamarkt) last week, and I looked at my friend's website with it. The videos load perfectly, using HTML5 I guess, given that the iPad does not support Flash at all.

So what's the trick? Does Google serve HTML5-enabled embedded videos when it detects the iPhoneOS/iOS Safari identification in the user agent? Or is it Safari that translates the YouTube links into HTML5-compatible links? In the former case, why does it not do the same when it detects Chrome/Chromium? In the latter, why can't there be an extension to do the same for Chrome/Chromium?

Once again, my point is that you cannot simply characterise Apple and Google as absolutely evil and absolutely good; there is no "pureness" in our modern world as it is, and I don't think that striving for it is going to work at all… extremes are not suited to human nature, not even extreme purity.

In all fairness

I know that Apple got a lot of hate from Free Software developers (and not only them) for the way they handle their App Store, mostly regarding the difficulty of actually getting applications approved. I honestly have no direct experience with it, but if I apply what I learnt from Gentoo, the time they take to get applications approved sounds about right for a thorough review.

Google, on the other hand, was said to take much less time, but from personal experience searching for content on the Android Market, I can only find DVD Jon's post quite on point. There are a number of applications that are, if not outright frauds, on the verge of being so, and they got approved easily.

On the other hand, as soon as Google was found to have added to the Froyo terms of service the fact that they reserve the option of remotely killing an application, tons of users cried foul. Just like they did for Apple, which also has the same capability and has been exercising it on applications that were later found not to comply with their terms of service.

*A note here: you might not like the way Apple insists on telling you what you should or should not use. I understand it pretty well, and that’s one of the reasons why I don’t use an iPhone. On the other hand, I don’t think you can say that Apple is doing something evil by doing so. Their platform, their choice; get a different platform for a different choice.*

So there are a number of people who think that Apple's application-review policy is evil (while Google allowing possible frauds is a-okay), and who, in both cases, consider the remote kill switch something nasty and a way for them to censor content for whatever evil plan they have. That paints both of them in a bad light, doesn't it? But Mozilla should be fine, shouldn't it?

I was sincerely wondering what those people who always find a way to despise "big companies" like Apple and Google at the same time, asking their users to choose "freer" alternatives (oftentimes with worse problems), would think, while I was reading Netcraft's report on the malware add-on found in the Mozilla add-on index.

I quote: "Mozilla will be automatically disabling the add-on for anyone who has downloaded and installed it." So Mozilla has a remote kill switch for extensions? Or how else are they achieving this?

And again: "[Mozilla] are currently working on a new security model that will require all add-ons to be code-reviewed before becoming discoverable on addons.mozilla.org." Which means they are going to do the same thing that Apple and Google already do (we'll have to wait and see to what degree).

Before people misunderstand me: I have nothing against Mozilla, and I think they are on the right track here. I would actually hope for Google to tighten their approval process, even if that means a much longer turnaround before new applications become available. As a user, I'd find it much more reassuring than what we have right now (why do half the demo/free versions of various apps want to access my personal data, hmm?).

What I'm trying to say here is that we should really stop crying foul over every choice that Apple (or Microsoft, or Sony, or whoever) makes; they might have quite good reasons for it, and we might end up following in their steps (as Mozilla appears about to do).

Some personal comments about Google’s WebM

So, Google finally announced what people have been calling for — the release, as free software and a free format, of the VP8 codec developed by On2 (the company that developed VP3 – from which Theora is derived – and that Google acquired a while ago).

Now, Dark Shikari of x264 fame dissected the codec and in part the file format; his words are – not unexpectedly, especially for those who know him – quite harsh, but as Mike put it “This open sourcing event has legs.”

It bothers me, though, that people dismissed Jason's comments as "biased FUD" from the x264 project. Let's set aside the fact that I think developers who insist that other FLOSS projects spread FUD about theirs are just paranoid, or are simply calling FUD what are actually real concerns.

Sure, nobody is denying that Jason is biased by his work on x264; I'm pretty sure he's proud of what they have accomplished with that piece of software. On the other hand, his post is actually well informed and – speaking as somebody who has been reading his comments for a while – not as negative as people seem to write it off as. Sure, he's repeating that VP8 is not on a technical par with H.264, but can you say he's wrong? I don't think so; he documented pretty well why he thinks so. He also has quite a few negative comments about the encoder code they released, but again that's nothing strange, especially given the high code-quality standard FFmpeg and x264 have got us used to.

Some people even went as far as saying that he's spreading FUD by agreeing with MPEG-LA about the chances that some patents still apply to VP8. Jason, as far as I know, is not a lawyer – and I'd probably challenge any general lawyer to take a look at the specs and the patents and give a perfect dissection of whether they apply or not – but I would, in general, accept his doubts. Not that it has much bearing on all this, I guess.

To put the whole situation into perspective, let's try to guess what Google's WebM is all about:

  • getting a better – in the technical meaning of the term – codec than H.264; or
  • getting an acceptable Free codec, sidestepping Theora and compromising with H.264.

Without agreeing on one or the other, there is no way to tell whether WebM is good or not. So I'll start by dismissing the first option, then. VP8 is not something new; they didn't develop it in the year or so after the acquisition of On2. It had been in the works for years already, and is more or less the same age as H.264 — easily demonstrated by the fact that Adobe and Sorenson were ready to support it from day one; if it were too new, that would have been impossible.

Jason points out weaknesses in the format (ignore the encoder software for now!), such as the lack of B-frames and the lower quality compared with the highest-possible H.264 options. I don't expect those comments to come as news to the Google people who worked on it (unless they are in denial); most likely they knew they weren't going to shoot H.264 down with this, but they accepted the compromise.

He also points out that some of the features are "copied" from H.264; that is most likely true, but there is a catch: while I'm not a lawyer, I remember reading that if you implement the same algorithm described by a patent but avoid hitting parts of its claims, you're not infringing upon it; if that's the case, then they might have been looking at those patents and explicitly trying to steer clear of them. Also, if the patent system has a minimum of common sense, once a patent describes an algorithm, patenting an almost identical one shouldn't be possible; that would cover VP8 if it stays near enough to H.264, but not too near. But this is just pure conjecture on my part, based on bits and pieces of information I have read in the past.

Some of the features, like B-frames, that could have greatly improved compression have been avoided; did they just forget about them? Unlikely; they probably decided that B-frames weren't something they needed. One likely option is that they wanted to avoid the (known) patent on B-frames, as Jason points out; the other is that they might simply have decided that the extra disk space and bandwidth caused by skipping B-frames was an acceptable downside in exchange for a format that is simpler to process in software on mobile devices — because in the immediate future, no phone is going to process this format in hardware.

Both Jason and Mike point out that VP8 is definitely better than Theora; that is more than likely, given that its algorithms had a few more years to be developed. This actually suggests that Google, too, didn't consider Theora good enough for their needs, like most multimedia geeks have been saying all along. Similarly, they rejected the idea of using Ogg as the container format, while accepting Vorbis; does that tell you something? It does to me: they needed something that actually worked (and yes, that's a post from just shy of three years ago I'm linking to) and not merely something that was Free.

I have insisted for a long time that the right Free multimedia container format is Matroska, not Ogg; I speak from the point of view of a developer who fought long with demuxers in xine (because xine does not use libavformat for demuxing, so we have our own demuxers for everything), who actually read through the Ogg specification and was scared. The fact that Matroska parallels most of the QuickTime Format/ISO Media/MP4 features is one very good reason for that. I’m happy to see that Google agrees with me…

Let me comment a bit on their decision to rebrand it and reduce it to a subset of features; I have honestly not looked at the format's specs, so I have no idea which subset that is. I read that they skipped things like subtitles (which sounds strange, given that YouTube does support them), but I haven't read anybody commenting on them doing anything in an incompatible way. In general, selecting a subset of another format's features is a good way to provide easy access to decoders: any decoder able to read the super-set format (Matroska) will work properly with the reduced one. The problem will be in the muxer (encoder) software, though, in choosing whether or not to use the various features.

The same has been true for the QuickTime format; generally speaking, the same demuxer (and muxer) can be shared to deal with QuickTime (.mov) files, MPEG-4 files (.mp4), iTunes-compatible audio and video files (.m4a, .m4v, .m4b), 3GPP files (.3gp) and so on. Unfortunately, here you don't have a super-/sub-set split; you actually get different dialects of the same format, each slightly different from the others. I hope Google will be able to avoid that!

Let me share some anecdotal evidence of problems with these formats, something that really happened to me. You might remember I wrote that a friend of mine directed a movie last year; on the day of the first screening, he exported the movie from Adobe Premiere to MPEG-4 format; then he went to create a DVD with iDVD on his MacBook – please, no comments on the software choice, it wasn't mine – but… surprise! The movie was recorded, and exported, in 16:9 (widescreen) ratio, but iDVD was seeing it as 4:3 (square)!

The problem was the digital equivalent of a missing anamorphic lens — the widescreen PAL 576i format uses non-square pixels, so together with the frame size in pixels, the container file needs to describe the pixel aspect ratio (16:15 for 4:3, 64:45 for widescreen). The trouble starts with the various dialects using different "atoms" to encode this information — iDVD is unable to fetch it the way Adobe writes it. Luckily, FFmpeg saved the day: nine seconds of processing with FFmpeg, remuxing the file into the iTunes-compatible QuickTime-derived format, solved the issue.
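The remux itself was along these lines; this is not the exact command line from back then, and it assumes a reasonably recent FFmpeg, but it shows the idea: copy the streams untouched and only rewrite the container's display aspect ratio.

```python
# Sketch of the kind of lossless remux that fixed the problem; the file names
# are made up, and a reasonably recent FFmpeg is assumed.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "movie-from-premiere.mp4",  # hypothetical input exported by Premiere
    "-c", "copy",                     # remux only: copy audio and video streams as-is
    "-aspect", "16:9",                # rewrite the display aspect ratio in the container
    "movie-for-idvd.m4v",             # hypothetical iTunes-compatible output
], check=True)

# For reference, the widescreen PAL pixel aspect ratio follows from:
#   PAR = DAR * height / width = (16/9) * (576/720) = 64/45
```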

This is why, with these formats, a single adaptive demuxer can be used — but a series of different muxers is needed. As I said, I sure hope Google will not make WebM behave the same way.

Besides that, I'm looking forward to the use of WebM: it would be a nice way to replace the (to me, useless) Theora with something that, even though not perfect, comes much closer. The (lousy) quality of the encoder does not scare me: as Mike again said, at some point FFmpeg will get from-scratch decoders and encoders, which will strive for the best results — incidentally, I'd say that x264 is one of the best encoders precisely because it is not proprietary software; proprietary developers tend to only care about having something working, while Free Software developers want something that works well and that they can be proud of.

Ganbatte, Google!