Adding semantic data to the blog

You probably remember that I’ve been a bit of a semantic data nerd: I added support for RDF and quite a bit of other metadata to both my blog and FSWS a long time ago. Since I was already looking into updating Autotools Mythbuster to make the website use XHTML5, I went to look at which options are available for semantic data there, as the old RDFa syntax is no longer available.

After some reading around, it turns out that even though RDFa is still supported, with a new syntax, the currently preferred way to declare semantic data on a webpage is through Schema.org, which is not a standards body but just a cooperation between the “big guys” in the web search business, mostly Google and Microsoft, not unlike sitemaps.org, which it shares its design with as well. The idea is that instead of having two dozen vocabularies to express metadata, you get one big vocabulary that includes most of the metadata search engines need to get the right data out of a page. After all, what is that data there to do, besides making it easier for search engines to find your important bits?

So I started with my blog, ripping out the old RDF metadata and setting up the new one. For the most part it turned out to be easy, although one of the biggest problems was avoiding overly redundant metadata. For instance, both the blog itself and each single article have me as author (that’s not strictly correct, as a blog could have more than one author, but in my case there are none, so…), and I was going crazy trying to use the itemref attribute to reuse the author data already expressed at the top level.

The trick here is that you don’t have to define a separate author property for the article; you just import the global one at the article level with itemref. The documentation out there for this is quite lacking, and as a result I wasted the morning trying to get Google to process the data correctly.
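As a minimal sketch of the idea, and not the blog's real template: the `blog-author` id, the placeholder name and title, and the choice of the `Blog`, `BlogPosting` and `Person` types are all assumptions made for illustration. The author is declared once as a property of the blog, and the article pulls that same element in through itemref instead of declaring its own author.

```html
<!-- Illustrative markup only; ids, names and layout are hypothetical. -->
<body itemscope itemtype="http://schema.org/Blog">
  <header>
    <!-- Declared once. Because this element carries itemprop="author",
         it is the author of the enclosing Blog item. -->
    <div id="blog-author" itemprop="author"
         itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">Author Name</span>
    </div>
  </header>

  <!-- No author property inside the article: itemref imports the element
       with id="blog-author", itemprop included, into this item. -->
  <article itemprop="blogPost"
           itemscope itemtype="http://schema.org/BlogPosting"
           itemref="blog-author">
    <h1 itemprop="headline">Post title</h1>
    <div itemprop="articleBody">Post text goes here.</div>
  </article>
</body>
```

The detail that is easy to miss is that itemref points at an element that already has its own itemprop, so the article needs no author markup at all.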

At any rate, the experiment turned out decent enough, which means the next day or two will be spent getting FSWS to emit the same kind of data; then I should have a good starting point to make Autotools Mythbuster use the same syntax.