The second best thing about standards: different implementations

The nice thing about standards is that you have so many to choose from.
Andrew S. Tanenbaum

The quote from Tanenbaum is a classic, something that most developers will have to face at some point in their career. But I’d like to expand on that, taking Open Standards into consideration as well. Most Free Software developers (and, argh, advocates!) will agree that Open Standards are a very good thing: make sure that they are fully documented, let people develop royalty-free implementations, and you’ve got a win.

Or do you? As the title of this post lets you know, there is one further problem with the standards to choose from: their implementations. I’ve already delved into a number of problems related to standards and their implementations; for instance the KWord vs OpenOffice problem, with the two using (at the time they started boasting OpenDocument support) two completely different, non-interoperable methods to define bullet lists. And again with the inconsistent SVG implementations that cause the same file to appear in vastly different ways in different software, without even an error being reported.

And eBooks are no different; let’s leave aside the problems with their formatting (for instance, O’Reilly books are easily readable, but look “randomly” formatted to me compared to others; or The Dragon Reborn, which probably underwent an OCR pass, given that Thom sometimes became Torn). I’ve already ranted about DTBook eBooks, but this time I’m seriously pissed.

Let me explain again the whole DTBook problem first, because it provides a basic context for the trouble that follows right now. I have a PRS-505 Sony Reader; when I bought it, it only supported PDFs (sort-of) and Sony’s own BBeB/LRF format. Thankfully, Sony updated the firmware to add support for the ePub format, which is supposedly an open standard and should have a number of working implementations, on various operating systems and hardware devices. Apple’s iPad among others is supposed to read ePub files. So what’s the catch?

Well, first of all, since I brought up Apple’s iPad, there is the problem of DRM. ePub by itself does not really define a DRM scheme; O’Reilly does not use any DRM in their electronic media (bless them), and Apple does not support DRM-locked ePub files either (and as far as I know they provide no DRM for their files either, but I don’t have a device to test it myself). On the other hand, most online bookstores, and devices such as the Sony Reader or Kobo’s eReader, support Adobe’s DRM scheme, technically called ADEPT, but marketed as “Adobe Digital Editions”. Of course, as far as I know at least, there is no open source software that can deal with ADEPT-locked files, although there is code out there that allows you to unlock the files once you fetch your personal encryption key out of an enabled system.

Okay, let’s leave DRM out now, and speak about the format itself. ePub files are ZIP files, not tremendously different from an OpenDocument file; they actually come with the same META-INF directory and mimetype file. Within that, you have a series of XML files, with the metadata of the book, the Table of Contents, a filename for the cover file, and the list of files with the actual book’s content. A note here: at least The Dragon Reborn seems to be a corrupted ZIP file according to both unzip and the inept script, but is read fine by the Reader Library, and by the Reader itself.
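To make the container layout a bit more concrete, here is a minimal Python sketch that opens an ePub as a plain ZIP archive, reads the mimetype file, and looks up the package file listed in META-INF/container.xml. The helper names (make_sample_epub, describe_epub) and the OEBPS/content.opf path are made up for illustration; the sample archive is built in memory only so the snippet runs without a real book at hand.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Hypothetical minimal container file; the package path is an assumption.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def make_sample_epub():
    """Build a tiny, illustrative ePub-like archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry is written first and left uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", "<package/>")
    buffer.seek(0)
    return buffer


def describe_epub(source):
    """Return the mimetype, package path and member list of an ePub."""
    with zipfile.ZipFile(source) as archive:
        members = archive.namelist()
        mimetype = archive.read("mimetype").decode("ascii").strip()
        container = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    package = rootfile.get("full-path") if rootfile is not None else None
    return {"mimetype": mimetype, "package": package, "members": members}


if __name__ == "__main__":
    print(describe_epub(make_sample_epub()))
```

With a real file you would pass its path instead of the in-memory sample, for example describe_epub("book.epub"), where the filename is just a placeholder. A file that unzip considers corrupted would most likely fail here with zipfile.BadZipFile.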

The content files can be of different formats; the most common case is (X)HTML, which as you might expect is the easiest to support, given the wide range of software rendering HTML out there. But a different format, called DTBook, was designed to support text-to-speech reading of audiobooks. Files can easily be called ePub even though the actual content is in DTBook, which most devices and software do not support; neither the Reader nor Calibre supports that format, so neither can read the copy I bought of The Salmon of Doubt (sigh!).
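One way to tell which kind of content a book carries before loading it on a device is to look at the media types declared in the package manifest. The following is a rough sketch, assuming the book declares its DTBook content as application/x-dtbook+xml and its (X)HTML content as application/xhtml+xml in the manifest; the helper names and the sample archive are hypothetical and only there to make the snippet runnable.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}
DTBOOK_TYPE = "application/x-dtbook+xml"  # assumed manifest media type

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

OPF_TEMPLATE = """<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf">
  <manifest>
    <item id="chapter1" href="chapter1.xml" media-type="{media_type}"/>
  </manifest>
</package>
"""


def build_sample(media_type):
    """Build an illustrative in-memory ePub with one content item."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("content.opf",
                         OPF_TEMPLATE.format(media_type=media_type))
    buffer.seek(0)
    return buffer


def content_media_types(source):
    """Count the media types declared in the package manifest."""
    with zipfile.ZipFile(source) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml lists no package file")
        package = ET.fromstring(archive.read(rootfile.get("full-path")))
    items = package.findall("opf:manifest/opf:item", OPF_NS)
    return Counter(item.get("media-type") for item in items)


def uses_dtbook(source):
    """True if at least one manifest item is declared as DTBook."""
    return content_media_types(source)[DTBOOK_TYPE] > 0


if __name__ == "__main__":
    print(uses_dtbook(build_sample(DTBOOK_TYPE)))
    print(uses_dtbook(build_sample("application/xhtml+xml")))
```

This only inspects what the manifest declares; it does not check that the content files actually match their declared type.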

Something even stranger happened when I bought (with a $2 discount, as this time it worked) Sourcery by Terry Pratchett. I started the series a year or two ago, but rather than getting the books, at the time I got the audiobook versions to get some sleep (I’m still doing the same thing, over a year and a half later… whenever I don’t have my iPod on during the night, I wake up feeling worse than when I went to sleep, because of bad nightmares…). Sourcery is the only one that I haven’t been able to listen to in its entirety since I started (well, I also didn’t listen to Mort, and rather read it as an eBook already). Unfortunately the downloaded ePub, even though it is not corrupt as far as unzip is concerned, cannot be viewed on the Reader: just like the DTBook version, it reports a “Page error”, shows no Table of Contents, and lists a start and end page of 1.

After unlocking the file with inept, I could load it in Calibre and… it actually reads fine. So the file is a valid ePub book; why on earth would the Reader not read it at all? Not something I can answer without having access to the sources, obviously. Luckily, at least this time I can read my book, since Calibre could process it and create a new ePub copy that the Reader actually seems to load and read.

Alas. I really have nothing else I could possibly say.

Again on procuring eBooks

I know that most of you who read my blog daily don’t care about my toying with eBooks, and only read it for the technical articles; on the other hand, I feel like I can at least talk a bit about that, given that most of my personal life is uninteresting and thus I rarely write about it at all.

Anyway, you might remember I had some trouble finding where to buy eBooks, and in the end I settled on – for non-technical books that is – WHSmith and Kobo, as they both sell Adobe Digital Editions ePub books. Finding mainstream non-DRM ePubs seems to be impossible; maybe only on Apple’s iBooks store, but that still doesn’t warrant me getting an iPad to try, even though, if you have an iPad or iPhone and can tell me whether that’s the case, I’d be curious. A second-hand old-generation iPhone shouldn’t be too expensive, and if that can get me access to mainstream non-DRM’d ePubs it might be worth it.

Anyway, the two sites above actually give me enough access that I don’t miss most of what I usually read; indeed, Kobo actually provided me with a few curious readings that I might as well try. Also, even though the Dollar is rising again, buying the books from Kobo is, for me, slightly cheaper than WHSmith.

Also, the fact that they are no simple eBook store makes them more intriguing. I’m not that enticed by their eReader (given I already have my PRS-505 and I’m not going to drop it any time soon), but I do like the fact that they have applications available for a number of platforms (but not Linux, dang it! If they did, and it supported activation of Adobe DRM’d ePubs, they would be so great I could consider getting the eReader if only to fund them further). Even if I will probably not use those, I can still enjoy the fact that they let me read the books I buy on the web with any browser, on my reader in ePub format (and thus anywhere the ePub format can be read!) and, as of a few days ago, also on my Milestone thanks to their Android application.

A word about the DRM here: I’m one of those people who, as I said already, prefer to abide by restrictions as long as they are an acceptable tradeoff (for instance the audiobooks DRM on iTunes is acceptable because they cost a lot less than in unencumbered form). While I can understand the reason why most publishers won’t even consider not using DRM on the files, and I accept that at least this way I can get eBooks at all, I don’t think the tradeoff is useful to the user in this case. Indeed, given the fact that not all devices using ePub support Adobe Digital Editions, it can be quite harsh to have it applied. Add to that the fact that not all ePubs are the same, and thus you might have to access the content of the archive to change it into something usable, and you get the picture. Luckily, the ADEPT DRM has long been broken so it’s not difficult to get clean files.
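Since not every device handles ADEPT, it can be handy to check whether a file is locked before copying it somewhere it won’t open. What follows is only a heuristic sketch: it assumes that ADEPT-locked ePubs carry both META-INF/rights.xml and META-INF/encryption.xml, which as far as I understand is what the unlocking scripts look for. An encryption.xml on its own can also show up for other reasons, so don’t treat the result as authoritative. The function and sample names are made up for the example.

```python
import io
import zipfile

# Assumed marker files for ADEPT-locked ePubs (a heuristic, not a spec).
ADEPT_MARKERS = ("META-INF/rights.xml", "META-INF/encryption.xml")


def make_sample(locked):
    """Build an illustrative in-memory archive, with or without markers."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", "<container/>")
        if locked:
            for marker in ADEPT_MARKERS:
                archive.writestr(marker, "<placeholder/>")
    buffer.seek(0)
    return buffer


def looks_adept_locked(source):
    """Guess whether an ePub is ADEPT-locked from its member names."""
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
    return all(marker in names for marker in ADEPT_MARKERS)


if __name__ == "__main__":
    print(looks_adept_locked(make_sample(locked=True)))
    print(looks_adept_locked(make_sample(locked=False)))
```

The check only reads the archive’s list of members, so it does not touch the encrypted content at all.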

Anyway, as I said, Kobo looks like a nice choice to me because of the presence of the additional applications (just to put it into perspective, while I’m not considering buying an iPad, were I to, I could still read the books I bought from Kobo, without going around the DRM, as they have an iPad application); for instance I could easily read The Salmon of Doubt from my browser, even though the ePub version uses the infamous DTBook format above. Unfortunately they don’t have *everything*… not yet at least.

Anyway, last night I didn’t sleep so I could finish reading Assassin’s Apprentice (somebody suggested this to me a few years back; on the other hand I decided to read it because some friends and I were at a fair where the author was also present…). Nice book indeed, just a bit “slow” (it took me almost a month to read it fully, and it was just 400 pages). Next, though, I wanted to come back to the Dresden Files; Butcher’s style is enchanting. Three books in, I was up to Summer Knight. Grave Peril I got from Kobo, so I assumed they had the next one as well; somehow, they don’t. So in the end I got it from WHSmith; it makes little difference, but it still strikes me as odd.

And in all this, there seems to be no shop for Italian eBooks; sigh. If only ChiareLettere had ePubs available… their books are quite bulky and I would love to give them away, trading them for digital copies. I wonder if I should get more (technical) skills about this kind of publishing and propose to handle that kind of stuff myself. I would also know where to start, maybe.

Cooling down about eBooks excitement

So I have written a few posts regarding eBooks in the past month or so, since I finally started using my Sony Reader full time. Unfortunately, it failed me yesterday, on the train back from Milan – where I was with a friend to show off his game – as I wanted to read The Salmon of Doubt, which I bought from Kobo at the start of the month.

It failed me with a quite unimpressive “page error”, so I thought the file was corrupted on the Memory Stick (or even that the Memory Stick had started to fail; they are not eternal, and this one has been passed down from a friend of mine to me for PSPs, and is now in the Reader, since digital distribution of PSP games called for something bigger than 1GB). I uploaded it to the Reader anew, and it still failed; I then decided to convert it with Calibre, but that also failed (although it at least gave me an idea about what the problem was in the first place!).

The problem, as it turns out, is that the ePub specification is, like ODT, SVG and MP4/ISO Media, a specification that includes so much more than any single implementation will ever support. One issue that lately has been noted by many is that Apple’s iBooks application for the iPad, which supports ePub books, surprisingly does not support DRM’d files (well, at least not those DRM’d with Adobe Digital Editions), but it’s not the only one. In this case, while the Sony Reader supports Adobe Digital Editions files, it does not support DTBook files. And that is what my ePub file is, deep within.

Now, there are tools that supposedly convert one format to the other, yet they don’t seem to produce much of a good result, so I wasn’t able to get it to appear properly just yet. And this also requires me to tinker quite a bit with raw files I don’t know a thing about.

This starts to make me wary about eBooks… one out of fifteen up to now doesn’t spell trouble, but there are cases where it might not be so good to have them around. Add the fact that there is basically no content I could find in Italian as eBooks, and I’m starting to fear I can only partly replace dead-tree books for a long time still. Sucks!