eBook distribution woes

So in my previous post I noted that Lulu published my eBook immediately and it has been for sale right away, while Kobo and Google Books are still reviewing it (even as I write this!). Turns out that this is the only positive side with Lulu.

When I first added the book, I noted that they have the option to revise a project (book) so that you can update/change it later. I assumed that buyers would always get the latest revision when downloading, even though they bought it at an earlier date (which is the case for Amazon, nowadays) — turns out that is not the case. Luckily, only three people bought the book from Lulu so I was able to reach them and provide them with the updated ePub (and I’ll keep sending them the updated revisions).

But I have more to say about Lulu: their support sucks. First of all, most of the documentation for publishers on their website is a support forum, which means that unless somebody already asked about something similar, you can’t find it. When you send them an inquiry by email (through a website form of course; there’s no email address on the website, and finding a “Contact Us” page is not possible), they first send you a list of FAQs that have nothing to do with your inquiry, and if you don’t reply to that with “no this did not solve my issue, get me a real person”, they don’t even look into what you wrote! They even send you a survey asking how the support was!

Now let me spend a few nice words for Amazon instead. First of all, even though they did a copyright check, publishing the book took less than half the week. Updates? Less than 24 hours! And they do let you download a new copy of the book with the updates. But that’s just the basic level. Somehow, searching Amazon.com (only the .com version) for either the title “Autotools Mythbuster” or for my name, does not show you my book at all — in the latter case, you actually find Bart’s book on Munin which I’m listed as a technical reviewer for. When I registered for the Author page it also didn’t show my book — it works fine on the UK page on the other hand.

When I contacted Amazon (which can be contacted via either email from the website, or livechat!), not only did they give me a very swift answer, but they followed up on it without feedback on my side, noting the problem is not fixed, and that they’ll escalate it to the technical department. I’ll be waiting for a solution; in the meantime the book is still available if you have a Kindle or a device compatible with the Kindle reading application (Android tablets are fine).

The situation now, though, is that I lack a distributor for the clear, DRM-free ePub version. As I said, I’m waiting for Kobo and Google Books, hoping that they’ll actually allow work-in-progress books to be published. It was suggested that I look into Leanpub, but it does not allow me to provide an already generated ePub; it’s meant for writing new content. The same goes for FastPencil, which I had to register for just to find out what it does.

So if you have any suggestions, please let me know, as I’d like to be able to reach even the people who want to stay away from Amazon as much as they can.

The issue with the split HTML/XHTML serialization

Not everybody knows that HTML 5 has been released in two flavours: HTML 5 proper, which uses the old serialization, similarly to HTML 4, and what is often incorrectly called XHTML 5, which uses XML serialization, like XHTML 1.0 and XHTML 1.1 did. The two serializations have different grades of strictness, and the browsers deal with them that way.

It so happens that the default DocBook output for XHTML 1 is compatible with the HTML serialization, which means that even if the files have a .html extension, locally, they will load correctly in Chrome, for instance. The same can’t be said of the XHTML 1.1 or XHTML5 output; one particularly nasty problem is that the generated code will output XML-style tags such as <a id="foo" /> which throw off the browsers entirely, unless properly loaded as XHTML … and on the other hand, IE still has trouble when served properly-typed XHTML (i.e. you have to serve it as application/xml rather than application/xhtml+xml).

So I have two choices: redirect all the .html requests to .xhtml, make it use XHTML 5 and work around the IE8 (and earlier) limitations, or I can forget about XHTML 5 at all. This starts to get tricky! So for the moment I decided to not go with XHTML 5, and at the same time I’m going to keep building ePub 2 books, and publish them as they are, instead of using ePub 3 (even though, as I said, O’Reilly got it working for their workflow).
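To make the first choice a bit more concrete, here is a minimal sketch of the kind of server-side decision it implies. The function name is made up for illustration, and the rule it applies is only an assumption drawn from the note above about IE: clients that do not advertise application/xhtml+xml get application/xml instead. It ignores q-values entirely, so take it as an outline and not a full content negotiation.

```python
def pick_content_type(accept_header: str) -> str:
    """Pick the MIME type to serve an XHTML 5 page with.

    Assumption: a client that does not list application/xhtml+xml in its
    Accept header (such as IE8 and earlier) is served application/xml.
    """
    # Strip parameters such as ";q=0.9" and normalise case.
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    return "application/xml"
```

The same rule could also live in the web server configuration; the function form just makes the decision easy to read.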

Unfortunately, even fixing that on the server side wouldn’t be enough on its own! I would also have to change the CSS, since many things that were always <div> before are now using proper semantic elements, including <section> (with the exception of the table of contents on the first landing page, obviously; damn). This actually makes things easier in one way, as it lets me drop the stupid nth-child CSS3 trick I used to set the style of the main div, compared to the header and footer. Hopefully this should let me fix the nasty IE 3 style beveled border that Chrome put around the Flattr button when using XHTML 5.

In the mean time I have a few general fixes to the style, now I just need to wait for the cover image to come from my designer friend, and then I can update both the website and the eBook versions all around the stores.

To close the post.. David, you deserve a public apology: while you were listed as <editor> on the DocBook sources before, and the XSL was supposed to emit it on the homepage, for whatever reason it fails to. I’ve upgraded you to <author> until I can find why the XSL is misbehaving so I can fix it properly.

In the mean time, tomorrow I’ll write a few more words about automake and then

The odyssey of making an eBook

Please note, if you’re reading this post on Gentoo Universe, that this blog is syndicated in its full English content; including posts like this which is, at this point, the status of a project that I have to call commercial. So don’t complain that you read this on “official Gentoo website” as Universe is quite far from being an official website. I could understand the complaint if it was posted on Planet Gentoo.

I mused last week about the possibility of publishing Autotools Mythbuster as an eBook. After posting the article I decided to look into which options I had for self-publishing, and, long story short, I ended up putting it for sale on Amazon and on Lulu (which nowadays handles eBooks as well). I’ve actually sent it to Kobo and Google Play as well, but they haven’t finished publishing it yet; Lulu is also taking care of iBooks and Barnes & Noble.

So let’s first get the question out of the way: the pricing of the eBook has been set to $4.99 (or equivalent) on all stores; some stores apply extra taxes (Google Play would apply 23% VAT in most European countries; books are usually 4% VAT here in Italy, but eBooks are not!), and I’ve been told already that at least from Netherlands and Czech Republic, the Kindle edition almost doubles in price — that is suboptimal for both me and you all, as when that happens, my share is reduced from 70 to 35% (after expenses of course).

Much more interesting than this, though, is the technical aspect of publishing the guide as an eBook. The DocBook Stylesheets I’ve been using (app-text/docbook-xsl-ns-stylesheets) provide two ways to build an ePub file: one is through a pure XSLT that bases itself off the XHTML5 output, and only creates the files (leaving it to the user to zip them up); the other is a one-call-everything-done Ruby script. The two options produce quite different files, respectively in ePub 3 and ePub 2 format. It is possible to produce an ePub 3 book that is compatible with older readers, as an interesting post from O’Reilly delineates, but doing so with the standard DocBook chain is not really possible, which is a bummer.
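For the pure XSLT route, the "zip them up" step has one catch worth spelling out: the container format expects the mimetype file to be the first entry in the archive, stored without compression. Below is a minimal Python sketch of that step. The function name and the directory layout are assumptions for illustration; it is neither what the stylesheets ship nor what the Ruby script does.

```python
import os
import zipfile


def pack_epub(source_dir: str, output_path: str) -> None:
    """Zip an unpacked ePub tree, putting the mimetype entry first."""
    with zipfile.ZipFile(output_path, "w") as epub:
        # "mimetype" goes first and uncompressed, so that readers can
        # identify the file type from the start of the archive.
        epub.write(os.path.join(source_dir, "mimetype"), "mimetype",
                   compress_type=zipfile.ZIP_STORED)
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Archive names always use forward slashes.
                arcname = os.path.relpath(full, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                epub.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

Calling pack_epub("output/epub", "book.epub") (both paths hypothetical) would produce the archive; running a validator such as epubcheck on the result is still a good idea.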

In the end, while my original build was with ePub 3 (which was fine for both Amazon and Google Play), I had to rebuild it for Lulu, which requires ePub 2. It might be worth noting that Lulu says that it’s because their partners, iBookstore and Nook store, would refuse the invalid file, as they check the file with epubcheck version 1… but as O’Reilly says, iBooks is one of the best implementations of ePub 3, so it’s mostly an artificial limitation, most likely caused by their toolchain or BN’s. I think from the next update forward I’ll stick with ePub 2 for a little while more.

On the other hand, getting these two to work also got me to have a working upgrade path to XHTML 5, which failed for me last time. The method I’ve been using to know exactly which chapters and sections to break on their own pages on the output, was the manual explicit chunking through the chunk.toc file — this is not available for XHTML5, but it turns out there is a nicer method by just including the processing instructions in the main DocBook files, which works with both the old XHTML1 and the new XHTML5 output, as well as ePub 2 and ePub 3. While the version of the stylesheet that generated the website last is not using XHTML5 yet, it will soon do that, as I’m working on a few more changes (among which the overdue Credits section).

One of the things that I had to be more careful with, with ePub 2, was the “dangling links” to sections I planned but haven’t written yet. There are a few in both the website and the Kindle editions, but they are gone for the Lulu (and Kobo, whenever they’ll make it available) editions. I’ve been working a lot last week to fill in these blanks, and extend the sections, especially for what concerns libtool and pkg-config. This week I’ll work a bit more on the presentation as well, since I still lack a real cover (which is important for an eBook at least), and there are a few things to fix on the published XHTML stylesheet as well. Hopefully, before next week there will be a new update for both website and ebooks that will cover most of this, and more.
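Finding those dangling links by hand gets tedious, so here is a rough sketch of how one could list them. It assumes DocBook 5 sources where targets carry an xml:id attribute and references use linkend; it reads a single XML string and does not follow XInclude, so a multi-file book would need to be flattened first. The function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the xml:id attribute with the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def dangling_links(docbook_source: str) -> list[str]:
    """Return the linkend targets that no element in the document defines."""
    root = ET.fromstring(docbook_source)
    defined = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    referenced = {el.get("linkend") for el in root.iter() if el.get("linkend")}
    return sorted(referenced - defined)
```

Any identifier it returns is a link pointing to a section that does not exist yet.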

The final word has to clarify one thing: both Amazon and Google Books put the review on hold the moment when they found the content available already online (mostly on my website and at Gitorious), and asked me to confirm how that was possible. Amazon unlocked the review just a moment later, and published by the next day; Google is still processing the book (maybe it’ll be easier when I’ll make the update and it’ll be an ePub 2 everywhere, with the same exact content and a cover!). It doesn’t seem to me like Lulu is doing anything like that, but it might just have noticed that the content is published on the same domain as the email address I was registered with, who knows?

Anyway to finish it off, once again, the eBook version is available at Amazon and Lulu — both versions will come with free update: I know Amazon allows me to update it on the fly and just require a re-download from their pages (or devices), I’ll try to get them to notify the buyers, otherwise it’ll just be notifying people here. Lulu also allows me to revise a book, but I have no idea whether they will warn the buyers and whether they’ll provide the update.. but if that’s not the case, just contact me with the Lulu order identifier and I’ll set up so that you get the updates.

The future of Autotools Mythbuster

You might have noticed after yesterday’s post that I have done a lot of visual changes to Autotools Mythbuster over the weekend. The new style is just a bunch of changes over the previous one (even though I also made use of sass to make the stylesheet smaller), and for the most part is to give it something recognizable.

I need to spend another day or two working on the content itself at the very least, as the automake 1.13 porting notes are still not correct, due to further changes done on Automake side (more on this in a future post, as it’s a topic of its own). I’m also thinking about taking a few days off Gentoo Linux maintenance, Munin development, and other tasks, and just work on the content on all the non-work time, as it could use some documentation of install and uninstall procedures for instance.

But leaving the content side alone, let me address a different point first. More and more people lately have been asking for a way to have the guide available offline, either as ebook (ePub or PDF) or packaged. Indeed I was asked by somebody if I could drop the NonCommercial part of the license so that it can be packaged in Debian (at some point I was actually asked why I’m not contributing this to the main manuals; the reason is that I really don’t like the GFDL, and furthermore I’m not contributing to automake proper because copyright assignment is becoming a burden in my view).

There’s an important note here: while you can easily see that I’m not pouring into it the amount of time needed to bring this to book quality, it does take a lot of time to work on it. It’s not just a matter of gluing together the posts that talk about autotools from my blog, it’s a whole lot of editing, which is indeed a whole lot of work. While I do hope that the guide is helpful, as I wrote before, it’s for the most part much more work than I can pour into it in my free time, especially in-between jobs like now (and no, I don’t need to find a job; I’m waiting to hear from one, and got a few others lined up if it falls through). While Flattr helps, it seems to be drying up, at least for what concerns my content; even Socialvest is giving me some grief, probably because I’m no longer connecting from the US. Beside that, the only “monetization” (I hate that word) strategy I got for the guide is AdSense – which, I remind you, kicked my blog out for naming an adult website on a post – and making the content available offline would defeat even the very small returns of that.

At this point, I’m really not sure what to do; on one side I’m happy to receive more coverage just because it makes my life easier to have fewer broken build systems around. On the other hand, while not expecting to get rich off it, I would like to know that the time I spend on it is at least partly compensated – token gestures are better than nothing as well – and that precludes a simple availability of the content offline, which is what people at this point are clamoring for.

So let’s look into the issues more deeply: why the NC clause on the guide? Mostly I want to have a way to stop somebody else exploiting my work for gain. If I drop the NC clause, nothing can stop an asshole from picking up the guide, making it available on Amazon, and get the money for it. Is it likely? Maybe not, but it’s something that can happen. Given the kind of sharks that infest Amazon’s self-publishing business, I wouldn’t be surprised. On the other hand, it would probably make it easier for me to accept non-minor contributions and still be able to publish it at some point, maybe even in real paper, so it is not something I’m excluding altogether at this point.

Getting the guide packaged by distributions is also not entirely impossible right now: Gentoo has generally not the same kind of issues as Debian regarding the NC clauses, and since I’m already using Gentoo to build and publish it, making an ebuild for it is tremendously simple. Since the content is also available on Git – right now on Gitorious, but read on – it would be trivial to do. But again, this would be cannibalizing the only compensation I got for the time spent on the guide. Which makes me very doubtful on what to do.

About the sources, there is another issue: while at the time I started all this, Gitorious was handier than GitHub, over time Gitorious interface didn’t improve, while the latter improved a lot, to the point that right now it would be my choice to host the guide: easier pull requests, and easier coverage. On the other hand, I’m not sure if the extra coverage is a good thing, as stated above. Yes, it is already available offline through Gitorious, but GitHub would make it effectively easier to get offline than to consult online. Is that what I want to do? Again, I don’t know.

You probably also remember an older post of mine from one and a half years ago where I discussed the reasons why I hadn’t published Autotools Mythbuster at least through Amazon; the main reason was that, at the time, Amazon had no easy way to update the book for the buyers without having them buy a new copy. Luckily, this has changed recently, so the obstacle has actually fallen. With this in mind, I’m considering making it available as a Kindle book for those of you who are interested. To do so I have first to create it as an ePub though, so it would solve the question that I’ve been asked about the eBook availability… but at the same time we’re back to the compensation issue.

Indeed, if I decide to set up ePub generation and start selling it on the Kindle store, I’d be publishing the same routines on the Git repository, making it available to everybody else as well. Are people going to buy the eBook, even if I priced it at $0.99? I’d suppose not. Which leaves me unsure what the target would be on the Kindle store: price it down so that the convenience of just buying it from Amazon outweighs the work of rolling your own ePub, or googling for a copy – considering that just one person rolling the ePub can easily make it available to everybody else – or price it at a higher point, say $5, hoping that a few interested users would fund the improvements? Either bet sounds bad to me honestly, even considering that Calcote’s book is priced at $27 at Amazon (hardcopy) and $35 at O’Reilly (eBook); obviously, his book is more complete, although it is not a “living” edition like Autotools Mythbuster is.

Basically, I’m not sure what to do at all. And I’m pretty sure that some people (who will comment) will feel disgusted that I’m trying to make money out of this. On the whole, I guess one way to solve the issue is to drop the NC clause, stick it into a Git repository somewhere, maybe keep it running on my website, maybe not, but not waste energy into it anymore… the fact that, with the much more focused topic, it has just 65 flattrs, is probably indication that there is no need for it — which explains why I couldn’t find any publisher interested in making me write a book on the topic before. Too bad.

Texinfo to Kindle, an odyssey

This should be my last week in Los Angeles for the moment. Tomorrow Excelsior will be connected directly to the Internet, with its own IP (v4) and an IPv6 tunnel ready. I’ll catch a plane next week to go back to Italy to take care of a few things, while it crunches numbers.

Since I’m expecting long plane rides in my future, I hope to be able to read much more. In particular, I want to finally find the time to learn enough Elisp to write my own Emacs modes. I really miss a decent ActionScript mode while I’m working on Flash code (don’t ask).

So I set myself out to find a way to produce a standard ePub file — from that, converting to a Kindle-compatible Mobi file, is just a matter of using Calibre.

I found this post from one and a half years ago, which describes the situation pretty nicely.. while I’m currently ignoring the issue with the TOC that the author is describing (probably simply because I haven’t been able to load this on my Kindle and judge it yet), I found a different one: makeinfo will generate invalid XML.

The problem lies in the id= attribute of XML, which is tightly specified by the language to have a given format (it has to start with only certain characters, and then only a few more are allowed; it can’t start with a number, for instance, nor can it contain a slash character). While makeinfo already had a function to (partially) escape an XML id, it wasn’t using it for the docbook output. The function itself, then, wasn’t considering all the escapes, and thus even when calling it, the output would still be invalid if the texi sources contained non-alphanumeric characters.
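To give an idea of what such an escape has to do, here is a sketch in Python. It is not the makeinfo function (which is C) nor my patch; the escaping scheme, the "id-" prefix and the function name are all made up for illustration. It keeps ASCII letters, digits, dots and hyphens, turns everything else into an underscore followed by hex bytes, and adds a prefix whenever the result would not start with a letter or underscore. It does not try to guarantee that two different inputs never collide.

```python
import re


def xml_id_escape(name: str) -> str:
    """Turn an arbitrary node name into an ASCII string usable as an XML id."""
    def encode(match: re.Match) -> str:
        # One "_xx" group per UTF-8 byte of the offending character.
        return "".join("_%02x" % byte for byte in match.group(0).encode("utf-8"))

    # The underscore itself is encoded too, since it doubles as the
    # escape marker in this (hypothetical) scheme.
    escaped = re.sub(r"[^A-Za-z0-9.-]", encode, name)
    # An id cannot start with a digit, a dot or a hyphen, nor be empty.
    if not re.match(r"[A-Za-z_]", escaped):
        escaped = "id-" + escaped
    return escaped
```

With this scheme a node called Invoking/gcc becomes Invoking_2fgcc, and one starting with a digit gets the prefix.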

So now I have a patch for texinfo which should work; too bad I also have to get a copyright assignment for this as well, and I don’t know if I’ll have to wait till I get home to sign and send it back or not. The important part is having the patch though. I also fixed the issue with setfilename being added to the output when creating docbook.

Then there is another issue: the dbtoepub script. In Gentoo this script is installed by docbook-xsl-stylesheets and docbook-xsl-ns-stylesheets within /usr/share; the problem is that it was never made easy to execute, and its dependencies weren’t considered. I took the chance of a bump of the stylesheets to add a USE flag for Ruby to the package (the script is written in Ruby) that will either remove the script or also install an executable wrapper so that it can be executed.

Actually, while I was at it, I made sure that the two ebuilds, which install two variants of the same basic content, will be almost identical just changing the directory where the content is installed, and making the remaining changes happen depending on $PN (an exception being the keywords as the namespaced version is not used so much, it’s just me liking them most of the time).

After I got the epub file, it was time to make sure it was complying with the specs; I’ve been burnt before by invalid or simply non-standard epub files. Luckily, Adobe of all people released an open source (BSD-licensed) tool to audit the files; epubcheck version 1.1 is now in tree as app-text/epubcheck. I’m hoping somebody who knows more Java than me can get a new version of jing in tree so I can bring epubcheck 3 into the tree — they use a quite newer one than is available right now, and that’s bad. The new version is designed to support the new version of the epub standard (which is supported by the 1.77.0 release of the stylesheets as well, and should be relatively easy to use even without Ruby), so I’m fine with version 1.1 right now.

Anyway all the tools I’ve been using should now be in tree (I’m testing the texinfo patch as we speak), and soon enough I should be able to start reading that manual on my Kindle.. expect some Emacs modes from me, afterwards.

The misery of the ePub format

I often assume that most of the people reading my blog have been reading it for a long enough time that they know a few of my quirks, one of which is my “passion” for digital distribution, and in particular my liking of eBooks over printed books. This passion actually stems from the fact I’d like to be able to move out of my current home soonish, and the less “physical” stuff I have to bring with me, the better.

I started buying eBooks back in 2010, when I discovered my Sony Reader PRS-505 (originally only capable of reading Sony’s own format) was updated to be able to read the “standard” ePub format, protected with Adobe’s Digital Editions DRM (ADEPT). One of my first suppliers for books in that format was WHSmith, the British bookstore chain. At the end I bought six books from them: Richard Dawkins’s The God Delusion, Nick Harkaway’s The Gone-Away World (which I read already, but wanted a digital copy of, after giving away my hardcopy to a friend of mine), and four books of The Dresden Files.

After a while, I had to look at other suppliers for a very simple reason: WHSmith started requiring me a valid UK post code, which I obviously don’t have. I then moved on to Kobo since they seemed to have a more varied selection, and weren’t tied to the geographical distribution of the UK vendor.

Here I got one of my first disappointments with the ePub “standard”: one of the books I bought from Kobo earlier, Douglas Adams’s The Salmon of Doubt, I still haven’t been able to read!

(I really wish Kobo at least replaced the book on their catalogue, since even their applications can’t read it, or otherwise I would like some store credit to get a different book, since that one can’t be read with their own applications.)

Over time, I came to understand that the ePub specifications are just too lax to make it a good format: there are a number of ePub files that are broken simply because the ZIP file is “corrupted” (the names within the records don’t match); a few required me to re-package them to be readable by the Reader; and a few more are huge just because they decide to get their own copy of DejaVu font family in the zip file itself. Of course to fix any of these issues, you also have to take the DRM out of the picture, which is luckily very easy for this format.
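The "names don't match" kind of corruption can be checked mechanically: a ZIP archive stores each file name twice, once in the local header right before the data and once in the central directory at the end of the file. The sketch below compares the two for every entry. It is a rough diagnostic written for illustration (the function name is made up), it assumes a plain archive without ZIP64 trickery, and it says nothing about what a given reader tolerates.

```python
import struct
import zipfile

# Fixed-size part of a ZIP local file header (30 bytes, little-endian):
# signature, version, flags, method, time, date, CRC, two sizes,
# file name length, extra field length.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def mismatched_names(path: str) -> list[tuple[str, str]]:
    """Return (central directory name, local header name) pairs that differ."""
    mismatches = []
    with zipfile.ZipFile(path) as archive, open(path, "rb") as raw:
        for info in archive.infolist():
            raw.seek(info.header_offset)
            fields = LOCAL_HEADER.unpack(raw.read(LOCAL_HEADER.size))
            flags, name_length = fields[2], fields[9]
            # Bit 11 of the flags marks UTF-8 names; otherwise CP437.
            encoding = "utf-8" if flags & 0x800 else "cp437"
            local_name = raw.read(name_length).decode(encoding, "replace")
            if local_name != info.orig_filename:
                mismatches.append((info.orig_filename, local_name))
    return mismatches
```

An empty list means the two copies of every name agree; anything else is the kind of file that stricter tools reject.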

Today, Kobo is once again the protagonist of a disappointment, a huge one, in terms of digital distribution; together with WHSmith. But first let’s take a step back to last week.

While in the United States with Luca, I got my hands on a Kindle (the version with keyboard); why? Well, from one side I was starting to be irked by the long list of issues I noted earlier about ePub books, but on the other hand, a few books such as Ian Fleming’s classic Bond novels were not available on Kobo or other ePub suppliers, while they were readily available on Amazon… plus a few of the books I could find on both Kobo and Amazon were slightly cheaper on the latter. I already started reading Fleming’s novels on the iPad through Amazon’s app, but I don’t like reading on a standard LCD.

Coming back home, we passed through London Heathrow; Luca went to look for a book to read on the way home, and we went to the WHSmith shop there… and I was surprised to see it was now selling Kobo’s own reader device (the last WHSmith shop I was at, a couple of years ago, was selling Sony exclusively). This sounded strange, considering that WHSmith and Kobo were rivals, for me in particular but in general as well.

I wasn’t that far off when I smelled something fishy; indeed, tonight I received a mail from WHSmith telling me they joined forces with Kobo, and that they will no longer supply eBooks on their webshop. The format being what it is, if they no longer kept the shop, you’d be left without a way to re-download your eBooks, which is why it is important for me that a digital distributor be solid.. turns out that WHSmith is not as solid as I supposed. So they suggest that you make an account at Kobo (unless you have one already, like I did) so that they can transfer your books to that account.

Lovely! For me that was very good news, since having the books on my Kobo account means not only being able to access them as ePub (which I had already), but also that I could read them on their apps for Android and iPad, as well as on their own website (very Amazon-y of them). Unfortunately there is a problem: out of the six books I bought at WHSmith, they only let me transfer… two!

It seems that, even though WHSmith decided to give (or sell) its customers to Kobo, as well as leaving them to provide their ebook offering instead, their partnership does not cover the distribution rights of the books they used to sell. This means that, for instance, the four Dresden Files novels I bought from WHSmith, which were being published, even digitally, by the British publisher, are not available to the Canadian store Kobo, which only lists the original RoC offerings.

This brings up two whole issues: the first is that unless your supplier is big enough that you can rely on it to exist forever, you shouldn’t trust DRM; luckily for me on the ePub side the DRM is so shallow that I don’t really care for its presence, and on the other hand I foresee Amazon’s DRM to be broken way before they start to crumble. The second issue is that even in the market of digital distribution, which is naturally a worldwide, global market, the regional limitations are making it difficult to have fair offerings; again Amazon seems to sidestep this issue, as it appears to me like there is no book available only in one region in their Kindle offerings: the Italian Kindle store covers all the American books as well.

Why Autotools Mythbuster is not a published ebook

I have already expressed that my time lately is so limited that there is no hope for me to catch a breath (today I’m doing triple shifts to be able to find the time to read Ghost Story the latest in The Dresden Files novels’ series, that was released today, oh boy do I love eBooks?). So you might probably understand why even Autotools Mythbuster hasn’t seen much improvement over the past month and something.

But I have considered its future during this time. My original idea of trying to write this down for a real publisher was shot down: the only publisher who expressed something more than “no interest at all” was the one which already had a book in the queue on the topic. The second option was to work on it during spare time, with donations covering the time spent on the task. This also didn’t fly much, if at all.

One suggestion I’ve been given was to make the content available in print – for instance through lulu – or as a more convenient offline read, as a properly formatted ebook. Unfortunately, this seems to be overly complex for very little gain. First of all, the main point of doing this should have been to give it enough visibility and get back some money for the time spent on writing it, so simply adding PDF and ePub generation rules to the guide wouldn’t be much of an improvement.

The two obvious solutions were, as noted, lulu, and on the other hand Amazon’s Kindle Store. The former, though, is a bit complex because any print edition would just be a snapshot of the guide at some point in time, not complete and just an early release, at any point in time. While it would probably still get me something, I don’t think it is “right” for me to propose such an option. I originally hoped for the Kindle Store to be more profitable and still ethical, but read on.

While there are some technical issues with producing a decent ePub out of a DocBook “book” – even O’Reilly isn’t getting something perfect out of their ePubs, both when read on the Sony Reader and with iPad’s iBooks – that isn’t the main issue with the plan. The problem is that Amazon seems to make Kindle e-books much more similar to print books than we’d like to.

While as an author you can update the content of your book, to replace it with an updated version with more, corrected content, the only way for the buyer to get the new content is to pay again in full for the item. I can guess that this was likely done on purpose, and almost as likely at least partially with an intent to protect the consumer from a producer who might replace the content of any writing without the former’s intervention, but this is causing major pain in my planning, which in turn causes this method to not be viable either.

What I am planning on adding is simply a PDF version, with a proper cover (but I need a graphic project for it), and a Flattr QR Code, that can then be read offline. But that’s not going to make the guide any more profitable, which means it won’t get any extra time…

The second best thing about standards: different implementations

The nice thing about standards is that you have so many to choose from.
Andrew S. Tanenbaum

The quote from Tanenbaum is a classic, something that most developers at some point in their career will have to face. But I’d like to expand on that; taking into consideration Open Standards as well. Most Free Software developers (and, argh, advocates!) will agree that Open Standards are a very good thing; make sure that they are fully documented, and let people develop royalty-free implementations, and you got a win.

Or do you? As the title of this post lets you know, there is one further problem with the standards to choose from: their implementation. I’ve already delved into a number of problems related to standards and their implementation; for instance the KWord vs OpenOffice problem, with the two using (at the time they started boasting OpenDocument support) two completely different, non-interoperable methods to define bullet-lists. And again with the inconsistent SVG implementations that cause the same file to appear in vastly different ways, without even an error reported, in different software.

And eBooks are no different; let’s leave aside the problem with formatting them (for instance, O’Reilly books are easily readable, but are formatted “randomly” for me, compared to others; or The Dragon Reborn, which probably underwent an OCR pass, given that Thom sometimes became Torn). I’ve already ranted about DTBook eBooks but this time I’m seriously pissed.

Let me explain again the whole DTBook problem first, because it provides a basic context for the trouble that follows right now. I have a PRS-505 Sony Reader; when I bought it, it only supported PDFs (sort-of) and Sony’s own BBeB/LRF format. Thankfully, Sony updated the firmware to add support for the ePub format, which is supposedly an open standard and should have a number of working implementations, on various operating systems and hardware devices. Apple’s iPad among others is supposed to read ePub files. So what’s the catch?

Well, first of all, since I brought up Apple’s iPad, there is the problem of DRM. ePub by itself does not really define a DRM scheme; O’Reilly does not use any DRM in their electronic media (bless them), and Apple does not support DRM-locked ePub files either (and as far as I know they provide no DRM for their files either, but I don’t have a device to test it myself). On the other hand, most online bookstores, and devices such as the Sony Reader or Kobo’s eReader, support Adobe’s DRM scheme, technically called ADEPT, but marketed as “Adobe Digital Editions”. Of course, as far as I know at least, there is no open source software that can deal with ADEPT-locked files, although there is code out there that allows you to unlock the files once you fetch your personal encryption key out of an enabled system.

Okay, let’s leave DRM out now, and speak about the format itself; ePub files are ZIP files, not tremendously different from OpenDocument files: an ePub actually comes with the same META-INF directory and mimetype file. Within that, you have a series of XML files, with the metadata of the book, the Table of Contents, a filename for the cover file, and the list of files with the actual book’s content. A note here: at least The Dragon Reborn seems to be a corrupted ZIP file for both unzip and the inept script, but is read fine by the Reader Library, and by the Reader itself.
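To make that layout a bit more concrete, here is a minimal Python sketch that peeks inside an ePub using only the standard library. The function name describe_epub is my own invention for illustration, and the sketch assumes the archive is not DRM-locked in a way that hides these files and that the ZIP itself is readable (which, as noted, is not a given).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the ePub container format.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def describe_epub(source):
    """Return basic container facts for an ePub (a path or a file-like object).

    Hypothetical helper for illustration only; it does not validate the book.
    """
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        # The mimetype file is expected to hold exactly this string.
        mimetype = archive.read("mimetype").decode("ascii").strip()
        # container.xml points at the package document (the book's metadata).
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        return {
            "mimetype_ok": mimetype == "application/epub+zip",
            "mimetype_first": names[0] == "mimetype",
            "package_document": (
                rootfile.get("full-path") if rootfile is not None else None
            ),
        }
```

Calling describe_epub("book.epub") on a real file (the filename is just a placeholder) would tell you where the package document with the metadata, Table of Contents reference and content list lives.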

The content files can be of different formats; the most common case is (X)HTML, which as you might expect is the easiest to support, given the wide range of software rendering HTML out there. But a different format, called DTBook, was designed to support text-to-speech reading of audiobooks. Files can easily be called ePub even though the actual content is in DTBook, which most devices and software do not support; neither the Reader nor Calibre supports that format, and thus neither can read the copy I bought of The Salmon of Doubt (sigh!).
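If you want to know up front whether a (DRM-free or unlocked) ePub is really XHTML or DTBook inside, something like the following Python sketch could do it. It is only a sketch under assumptions: the helper names are hypothetical, and it relies on the package document declaring its content with the media type application/x-dtbook+xml, which as far as I know is the one ePub 2 uses for DTBook.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_NS = "http://www.idpf.org/2007/opf"
# Assumption: the media type ePub 2 package documents use for DTBook content.
DTBOOK_TYPE = "application/x-dtbook+xml"


def spine_media_types(source):
    """List the declared media type of each content file, in reading order."""
    with zipfile.ZipFile(source) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        package = ET.fromstring(archive.read(rootfile.get("full-path")))
    # The manifest maps item ids to media types; the spine gives reading order.
    manifest = {
        item.get("id"): item.get("media-type")
        for item in package.iter(f"{{{OPF_NS}}}item")
    }
    spine = [ref.get("idref") for ref in package.iter(f"{{{OPF_NS}}}itemref")]
    return [manifest.get(idref) for idref in spine]


def uses_dtbook(source):
    """True if any content file in the spine is declared as DTBook."""
    return DTBOOK_TYPE in spine_media_types(source)
```

This only trusts what the package document declares; a book that mislabels its content would fool it.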

Something even stranger happened when I bought (with a $2 discount, as this time it worked) Sourcery by Terry Pratchett. I started the series a year or two ago, but rather than getting the books, at the time I got the audiobook versions to get some sleep (I’m still doing the same thing, over a year and a half later… whenever I don’t have my iPod on during the night, I wake up feeling worse than when I went to sleep, because of bad nightmares…). Sourcery is the only one that I haven’t been able to listen to in its entirety since I started (well, I also didn’t listen to Mort and rather read it as an eBook already). Unfortunately the downloaded ePub, even though it is not corrupt as far as unzip is concerned, cannot be viewed on the Reader: just like the DTBook version, it reports a “Page error”, shows no Table of Contents, and lists a start and end page of 1.

After unlocking the file with inept, I could load it in Calibre and... it actually reads fine. So the file is a valid ePub book; why on earth would the Reader not read it at all? Not something I can answer without having access to the sources, obviously. Luckily, at least this time I can read my book, since Calibre could process it and create a new ePub copy that the Reader actually seems to load and read.

Alas. I really have nothing else I could possibly say.

Again on procuring eBooks

I know that most of you who read my blog daily don’t care about my toying with eBooks, and only read it for the technical articles; on the other hand, I feel like I can at least talk a bit about that, given that most of my personal life is uninteresting and thus I rarely write about it at all.

Anyway, you might remember I had some trouble finding where to buy eBooks and in the end I settled on – for non-technical books that is – WHSmith and Kobo, as they both sell Adobe Digital Editions ePub books. Finding mainstream non-DRM ePubs seems to be impossible; maybe only on Apple’s iBooks store, but it still doesn’t warrant me getting an iPad to try. If you have an iPad or iPhone and can tell me whether that’s the case, I’d be curious. Finding a second-hand old-generation iPhone shouldn’t be too expensive, and if that can get me access to mainstream non-DRM’d ePubs it might be worth it.

Anyway, the two sites above actually give me enough access that I don’t miss most of what I usually read; indeed, Kobo actually provided me with a few curious readings that I might as well try. Also, even though the Dollar is rising again, buying the books from Kobo is, for me, slightly cheaper than WHSmith.

Also, the fact that they are not a simple eBook store makes them more intriguing. I’m not that enticed by their eReader (given I already have my PRS-505 and I’m not going to drop it any time soon), but they have applications available for a number of platforms (but not Linux, dang it! If they did, and it supported activation of Adobe DRM’d ePubs, they would be so great I could consider getting the eReader if only to fund them further). Even if I will probably not use those, I can still enjoy the fact that they let me read the books I buy on the web with any browser, on my reader in ePub format (and thus anywhere the ePub format can be read!) and, as of a few days ago, also on my Milestone thanks to their Android application.

A word about the DRM here. As I said already, I’m one of those people who prefer to abide by restrictions as long as they are an acceptable tradeoff (for instance the audiobooks DRM on iTunes is acceptable because they cost a lot less than in unencumbered form). While I can understand the reason why most publishers won’t even consider not using DRM on the files, and I accept that at least this way I can get eBooks at all, I don’t think the tradeoff is useful to the user in this case. Indeed, given that not all devices using ePub support Adobe Digital Editions, it can be quite harsh to have it applied. Add to that the fact that not all ePubs are the same, so you might have to access the content of the archive to change it into something usable, and you get the picture. Luckily, the ADEPT DRM has long been broken so it’s not difficult to get clean files.

Anyway, as I said, Kobo looks like a nice choice to me because of the presence of the additional applications (just to put it into perspective, while I’m not considering buying an iPad, were I to, I could still read the books I bought from Kobo, without going around the DRM, as they have an iPad application); for instance I could easily read The Salmon of Doubt from my browser, even though the ePub version uses the infamous DTBook format above. Unfortunately they don’t have *everything*… not yet at least.

Anyway, last night I didn’t sleep so I could finish reading Assassin’s Apprentice (somebody suggested this to me a few years back; on the other hand I decided to read it because some friends and I went to a fair where the author was also present…). Nice book indeed, just a bit “slow” (took me almost a month to read it fully, and it was just 400 pages). Next step, though, I wanted to come back to The Dresden Files; Butcher’s style is enchanting. Three books in, I was up to Summer Knight. Grave Peril I got from Kobo, so I assumed they had the next as well; somehow, they don’t. So in the end I got it from WHSmith; it makes little difference, but it still strikes me as odd.

And in all this, there seems to be no shop for Italian eBooks; sigh. If only ChiareLettere had ePubs available... their books are quite bulky and I would love to give them away and trade them for digital copies. I wonder if I should get more (technical) skills about this kind of publishing and propose to handle that kind of stuff myself. I would also know where to start, maybe.

Technical eBooks? Scarcer than I’d have said!

Do you remember I went back to using the Reader with proper (ePub) content? It also turned out pretty well when I could get a newly-released book even before it was released in Europe (and for a much lower price).

A month after resuming this, I have to accept an absurd reality: it’s much easier to find novels than technical books in ePub format! Now it is true that most of the O’Reilly catalogue is available in ePub format (one exception being CJKV Information Processing which, for the complexity of the script, is only available as PDF; it would still have been pretty expensive if I hadn’t caught the 1-day offer the other day of getting any eBook for $10, and for that price even just a PDF is good enough), but they seem to be an exception.

Indeed, Addison Wesley does not seem to have their catalogue available as eBook at all! And they tend to have some very interesting books – some of which I read thanks to the gifts received – if they had them available as eBook, I would probably be buying a few more of them!

Tonight I was also looking at MIT Press, since I would like to convert my current shelf to eBooks, for those titles that I’m still interested in having around, and which are available as eBooks obviously; Using OpenMP is one of them. My idea was to find out how much they would cost me as eBooks, which is usually a fraction of their original price, and “sell” them for the same price to interested friends. While they do have an eBook store, it doesn’t have their older titles, and it leaves a lot to be desired.

My reason for wanting to convert what I have already is that I’m getting ready to pack and get the hell away from home; lots of things are going on and I’m in the middle of a very nasty family situation. I’ll be looking for my luck elsewhere, most likely in Turin, hopefully later on this year. But before leaving, I’m trying to get rid of some baggage, both psychological and physical; books are something that, while I’d be sad to part with them, I cannot afford to bring with me when I move.

Incidentally, I’m in a bit of a pinch with CDs and DVDs as well… I already rip all my CDs and the music DVDs to bring them with me more easily on the iPod, but I don’t want to get rid of the originals; I guess that once I move I might still get some “physical storage space” here, to keep them. I already moved to buying music digitally – through the iTunes store, thankfully they don’t have DRM any more! – but audiobooks are still crippled (“protected”, as they tell you), and metal loses some edges when encoded. Let’s not even get into digitally-distributed movies. And yes, I’m the kind of person who gladly pays for content.

On the other hand, as far as fiction and non-fiction books are concerned, there are quite a few possible stores, such as -WHSmith- Kobo and Waterstones; the only problem I have with them is that none of them supports a wishlist. I’d love to replace the one I have now on Amazon with one for eBooks: they’d be cheaper and I’d have less trouble bringing them around.

Anyway, I’m still baffled by the lack of vast archives of technical eBooks.