Adding semantic data to the blog

You probably remember that I’m a bit of a semantic data nerd: I added support for RDF and quite a bit of other metadata to both my blog and FSWS a long time ago. Well, since I was already looking into updating Autotools Mythbuster so that the website uses XHTML5, I went to look at which options are available for semantic data there, since the old RDFa syntax is no longer available.

After some reading, it turns out that even though RDFa is still supported, with a new syntax, the currently preferred way to declare semantic data on a webpage is through schema.org. This is not a standards body but just a cooperation between the “big guys” in the web search business, mostly Google and Microsoft, not unlike sitemaps.org, with which it also shares its design. The idea is that instead of having two dozen vocabularies to express metadata, you get one big vocabulary that covers most of the metadata search engines need to get the right data out of a page. After all, what is that data there to do, besides making it easier for search engines to find your important bits?

So I started with my blog, ripping out the old RDF metadata and setting up the new one. For the most part it turned out to be easy, although one of the biggest problems was avoiding overly redundant metadata. For instance, both the blog itself and each single article have me as author (it’s not strictly correct, as a blog could have more authors, but in my case there are none, so…), and I was going crazy trying to use the itemref attribute to make the article reuse the author data already expressed at the top level.

The trick here is that you don’t have to define a separate author property for the article: you just import the global one at the article level with itemref. The documentation out there for this is quite lacking, and as a result I wasted the morning trying to get Google to process the data correctly.
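A minimal sketch of what this can look like in HTML5 microdata follows. It is an illustration under assumptions, not a copy of this blog’s templates: the id blog-author, the names and titles, and the choice of the schema.org types and properties (Blog, BlogPosting, Person, blogPost, headline, articleBody) are all picked for the example.

```html
<!-- The whole page is one Blog item. -->
<body itemscope itemtype="https://schema.org/Blog">
  <header>
    <h1 itemprop="name">Example Blog</h1>
    <!-- The author is declared once, as a property of the Blog.
         The id (hypothetical name) is what itemref points at. -->
    <span id="blog-author" itemprop="author"
          itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">Jane Doe</span>
    </span>
  </header>

  <!-- The article declares no author of its own: itemref pulls in the
       element with id="blog-author", including its itemprop="author". -->
  <article itemprop="blogPost"
           itemscope itemtype="https://schema.org/BlogPosting"
           itemref="blog-author">
    <h2 itemprop="headline">Example post title</h2>
    <div itemprop="articleBody">Post text goes here.</div>
  </article>
</body>
```

The Person element is a descendant of the Blog item, so it counts as the blog’s author. The itemref on the article tells a microdata parser to also crawl that element when collecting the article’s properties, so the same Person becomes the article’s author without being repeated in the markup.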

At any rate, the experiment turned out decent enough, which means that I’ll spend the next day or two getting FSWS to emit the same kind of data; then I should have a good starting point to make Autotools Mythbuster use the same syntax.

6 thoughts on “Adding semantic data to the blog”

  1. Is there any test tool that displays semantic metadata if it could be parsed from the content? I would like to play with it in my blog…

  2. Yes, Google has a “Rich Snippet” testing tool (http://www.google.com/webma…) that will tell you what it finds in the page. Microsoft also has a similar tool in its Bing Webmaster Tools, but it requires a Microsoft Account to get to it (I have one that I use mostly for Xbox Live). For RDFa metadata, http://validator.nu/ should do instead, but it doesn’t seem to understand the microdata format used by schema.org.

  3. I miss the good old days when everyone blogged about the changes they did to their blogs. I like the summaries of all the quirky little details and technicalities that were changed.
