Publishing Documentation

I have been repeating for years that blogs are not documentation in and of themselves. While I have spent a lot of time over the years making sure that my blog’s links are not broken, I also know that many of my old blog posts are no longer relevant at all. The links leading out of the blog can break, and it’s not particularly easy to identify them. What might have been true in 2009 might not be true in 2020. The best option for implementing something has likely changed significantly: ten years ago, Cloud Computing was barely a thing on the horizon, and LXC was considered an experiment.

This is the reason why Autotools Mythbuster is the way it is: it’s a “living book” — I can update and improve it, but at the same time it can be used as a stable reference of best practices: when they change, it gets updated, but the link still points to the current good practice.

At work, I pretty much got used to “Radically Simple Documentation” – thanks to Riona and her team – which meant I only needed to care about the content of the documentation, rather than dealing with how it would render, either in terms of pipeline or style.

And just like with other problems with the bubble, when I try to do the same outside of it, I get thoroughly lost. The Glucometer Protocols site has been hosted as GitHub Pages for a few years now — but I wanted to add some diagrams, as more modern protocols (as well as some older, but messier, ones) would be much simpler to explain with UML Sequence Diagrams to go with the text.

The first problem was of course to find a way to generate sequence diagrams out of code that can be checked in and reviewed, rather than as binary blobs — and thankfully there are a few options. I settled for blockdiag because it’s the easiest to set up in a hurry. But it turned out that integrating it is far from as easy as it would seem.
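To give an idea of what “diagrams as reviewable code” means in practice, here is a minimal sketch in the seqdiag syntax from the blockdiag family — the actor names and messages are made up for illustration, not taken from any real protocol:

```
seqdiag {
  app -> meter [label = "read records command"];
  app <- meter [label = "stored readings"];
}
```

A plain-text source like this diffs and reviews like any other file, which is the whole point.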

While GitHub Pages uses Jekyll, it uses such an old version that reproducing it on Netlify is pretty much impossible. Most of the themes that are available out there are dedicated to personal sites, or ecommerce, or blogs — and even when I found one that seemed suitable for this kind of reference, I couldn’t figure out how to get the whole thing to work. And it didn’t help that Jekyll appears to be very scant on debug logging.

I tried a number of different static site generators, including a few in JavaScript (which I find particularly annoying), but the end result was almost always that they seemed more geared towards “marketing” sites (in a very loose sense) than references. To this moment, I miss the simplicity of g3doc.

I ended up settling for Foliant, which appears to be more geared towards writing actual books than reference documentation, but it wraps around MkDocs, and it provides a plugin that integrates with Blockdiag (although I still have a pending pull request to support more diagram types). And after playing around with it a bit, I managed to get Netlify to build this properly and serve it. Which is what you get now.
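From memory, the Foliant configuration ends up looking roughly like this — the names and chapter list here are illustrative, not copied from the actual site, so treat this as a sketch rather than a working config:

```yaml
title: Glucometer Protocols

chapters:
  - index.md

preprocessors:
  - blockdiag

backend_config:
  mkdocs:
    mkdocs.yml:
      site_name: Glucometer Protocols
```

Foliant then renders the Markdown (with the diagrams pre-processed) through MkDocs, which is the part Netlify actually serves.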

But of course, since MkDocs (and a number of other Python-based tools I found) appear to rely on the same Markdown library, they are not even completely compatible with the Markdown as written for Jekyll and GitHub Pages: the Python implementation is much stricter when it comes to indentation, and misses some of the features. Most of those appear to have been works in progress at some point, but there doesn’t seem to be much movement on the library itself.
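The indentation strictness is concrete: Python-Markdown wants nested list items indented by four spaces, while the kramdown used by Jekyll is happy with two. A quick pre-processing pass can bridge the gap — this helper is hypothetical, my own sketch rather than part of any of these tools:

```ruby
# Hypothetical helper: rewrite two-space nested list indentation into the
# four spaces that the stricter Python-Markdown implementation expects.
BULLET = /\A( +)([-*+] )/

def widen_list_indent(text, src = 2, dst = 4)
  text.each_line.map do |line|
    if (m = BULLET.match(line))
      levels = m[1].length / src        # nesting depth at the old indent
      (' ' * (levels * dst)) + line[m[1].length..-1]
    else
      line
    end
  end.join
end
```

Running something like this over the sources once is less painful than re-indenting every nested list by hand.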

Again, these are relatively simple features I came to expect for documentation. And I know that some of my (soon-to-be-former) colleagues have been working on improving the state of opensource documentation frameworks, including Lisa working on Docsy, which looks awesome — but it relies on Hugo, which I still dislike, and which seems to have taken a direction moving further and further away from me (the latest, when I was trying to set this up, is that to use Hugo on Linux they now seem to require you to install Homebrew — because clearly having something easy for Linux packagers to work with is not worth it, sigh).

I might reconsider, if Hugo finds a way to build images through external tools, but I don’t have strong expectations that the needs of reference documentation will be considered in future updates to Hugo, given how it was previously socialized as a static blog engine, only to pivot to needs that would make it more “marketable”.

I even miss GuideXML, to a point. This was Gentoo’s documentation format back in the days before the Wiki. It was complex — probably more complex than it needed to be — but at least the pipeline to generate the documentation was well defined.

Anyhow, if anyone out there has experience in setting up reference documentation sites, and wants to make it easier to maintain a repository of information on glucometers, I’ll welcome help, suggestions, pull requests, and links to documentation and tools.

From Textile to Markdown

You probably remember by now my blog’s woes with the piece of crap that passes for a blog engine I’m using right now. Since enough people asked me to please not kill the blog, and since, if I keep it around, I would like to keep it up to date (if it were to go stale forever I would proceed with my idea of killing it altogether, and maybe make a book out of it), I’m looking into what options I have.

The options for me are either to move to a different blog engine altogether, or to just cut features out of the currently-running version of Typo to remove the errors and the stuff I don’t need. Please do not suggest that I use a static generator, as I don’t care for that: I want to be able to edit my blog online, and I want to be able to have comments. And I’m not going to maintain two different pieces of software — a static engine and a comment system — and no, I’m not going to use Disqus or similar. Full stop.

Whichever way it goes, one thing that I need to change is the text format the posts are written in. Right now most of them are written in Textile. I’m not sure if the choice was simply due to Markdown sucking at the time, or to Textile being the format used by Serendipity, which I was using before. In any case, Textile lost the text format war and everything is Markdown nowadays. So I changed the settings for new posts and I’m writing them in Markdown. The problem is converting the old posts.

Now thankfully I’ve already been pointed at pandoc, which is a great tool… but its support for Textile, like that of most platforms, is not really perfect. For instance, lists are not properly converted: bullet lists are only evaluated correctly if the line does not start with a space (even though the Ruby gem for Textile supports a leading space), and image URLs, which are expressed between exclamation marks, are matched across lines, making a mess of posts where more than one exclamation mark is present.

I can probably work around those two issues during the conversion (I already have a script that can pass all the posts through pandoc to convert them to Markdown), but there are bound to be more… which means I’ll have to go through all my posts (or most of them at least) to make sure they have been converted correctly. Is anybody with a liking for Haskell willing to fix these smaller issues for me?
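The workaround I have in mind is a pre-processing pass over the Textile sources before they hit pandoc — both the function name and the odd-exclamation-mark heuristic below are mine, not something pandoc provides:

```ruby
# Hypothetical pre-processing for Textile sources before pandoc:
#  - neutralize stray exclamation marks (heuristic: an odd count on a line
#    means at least one of them is not an image delimiter), so pandoc
#    cannot pair them across lines as image URLs;
#  - un-indent bullet lists, since pandoc only recognizes bullets that
#    start at column 0.
def preprocess_textile(text)
  cleaned = text.each_line.map do |line|
    line.count('!').odd? ? line.gsub('!', '&#33;') : line
  end.join
  cleaned.gsub(/^ +(\*+ )/, '\1')
end
```

It is a heuristic, so a manual review pass over the converted posts would still be needed.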

Hopefully, I’ll soon be able to stop relying on Textile, and the multiple text filters are going to be the first thing I get rid of in Typo, as their execution requires database access for no good reason…

Future planning for Ruby-Elf

My work on Ruby-Elf tends to happen in “sprees” whenever I actually need something from it that wasn’t supported before — I guess this is true for many projects out there, but it seems to happen pretty regularly with my projects. The other day I prepared a new release after fixing the bug I found while doing the postmortem of a libav patch — and then I proceeded to give another run to my usual collisions check, after noting that I could improve the performance of the regular expressions…

But where is it headed? Well, I hope I’ll be able to have version 2.0 out before the end of 2013 — in this version, I want to make sure I get full support for archives, so that I can analyze static archives without having to extract them beforehand. I’ve got a branch with the code to access the archives themselves, but it can only extract each file before actually being able to read it. The key to supporting archives would probably be supporting in-memory IO objects, as well as offset-in-file objects.
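To give an idea of what I mean by an offset-in-file object — a sketch, with a made-up class name and interface, of an IO-like window that confines reads to the slice of a bigger file where an archive member sits:

```ruby
require 'stringio'

# Hypothetical sketch of an offset-in-file IO: reads are confined to a
# window of the underlying file, so an archive member can be parsed in
# place without extracting it first.
class OffsetIO
  def initialize(io, offset, size)
    @io = io          # underlying File or StringIO
    @offset = offset  # where the member starts in the containing file
    @size = size      # length of the member
    @pos = 0          # read position, relative to the member
  end

  def seek(pos)
    @pos = pos
    0
  end

  def read(length = nil)
    length = @size - @pos if length.nil?
    length = [length, @size - @pos].min
    return nil if length <= 0
    @io.seek(@offset + @pos)
    data = @io.read(length)
    @pos += data.bytesize
    data
  end
end
```

The same interface works whether the backing object is a File on disk or an in-memory StringIO, which covers both of the cases above.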

I’ve also found an interesting gem called bindata which seems to provide a decent way to decode binary data in Ruby without having to fully pre-decode it. This would probably be a killer feature for Ruby-Elf, as a lot of the time I’m forcibly decoding everything because it was extremely difficult to access it on the spot — so the first big change for Ruby-Elf 2 is going to be delegating the task of decoding to bindata (or, otherwise, another similar gem).
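For reference, this is the kind of hand-rolled decoding that a declarative library would take over — a stdlib-only sketch of reading the first ELF identification bytes (the function is illustrative, not Ruby-Elf’s actual code):

```ruby
ELF_MAGIC = "\x7FELF".b

# Decode the start of e_ident by hand with String#unpack; a library like
# bindata would replace this with a declarative record definition instead
# of manual format strings and offsets.
def parse_elf_ident(bytes)
  magic, ei_class, ei_data = bytes.unpack('a4CC')
  raise ArgumentError, 'not an ELF file' unless magic == ELF_MAGIC
  {
    word_size: ei_class == 1 ? 32 : 64,   # EI_CLASS: ELFCLASS32 / ELFCLASS64
    endian:    ei_data == 1 ? :little : :big  # EI_DATA: ELFDATA2LSB / ELFDATA2MSB
  }
end
```

Multiply this by every header, section, and symbol table, and the appeal of pushing the decoding down to a gem becomes obvious.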

Another change that I plan is to drop the current version of the man pages. While DocBook is a decent way to deal with man pages, and standard enough to be around in most distributions, it’s one “strange” dependency for a Ruby package — and honestly the XML is a bit too verbose sometimes. For the beefiest man pages, the generated roff page is half as big as the source, which is the opposite of what anybody would expect.

So I’m pretty much decided that the next version of Ruby-Elf will use Markdown for the man pages — while it does not have the same amount of semantic tagging, and thus I might have to handle some styling in the synopsis manually, using something like md2man should be easy (I’m not going to use ronn because of the old issue with JRuby and rdiscount), and at the same time it gives me a public HTML version for free, thanks to GitHub’s conversion.

Finally, I really hope that by Ruby-Elf 2 I’ll be able to get at least the symbol demangler for the Itanium C++ ABI — that is, the one used by modern GCC; yes, it was originally specified for the Itanic. Working toward supporting the full DWARF specification is something that is in the back of my mind, but I’m not very convinced right now, because it’s huge. Also, if I were to implement it I would then have to rename the library to Dungeon.
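As a taste of the grammar: the mangling for a free function with no arguments is just `_Z`, the length-prefixed name, and `v` for the empty parameter list. A toy sketch handling only that single case — the real ABI grammar is enormously larger, and this function is mine, not Ruby-Elf code:

```ruby
# Toy demangler for the single simplest Itanium C++ ABI case:
#   _Z <length> <name> v   =>   name()
# e.g. _Z3foov => foo(). Anything else is returned untouched.
def demangle_trivial(symbol)
  m = /\A_Z(\d+)(\w+)v\z/.match(symbol)
  # The length prefix must match the name exactly, or it is not this case.
  return symbol unless m && m[2].length == m[1].to_i
  "#{m[2]}()"
end
```

Everything beyond this — namespaces, templates, substitutions, operator names — is where the real work (and the reason for my hesitation) lives.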