In my previous post I noted that I was adding support for the latest fad in semantic tagging of data on web pages, but it was not at all clear who actually consumes that data. So let’s see.
In the midst of the changes I’ve been sending to Typo to support a fully SSL-compatible blog install (mine is not quite there yet, mostly because the internal links from one post to the next are not currently protocol-relative), I’ve added one commit to provide a bit more OpenGraph data. OpenGraph is used almost exclusively by Facebook. The only metadata I provide through that protocol, though, is an image for the blog (since I don’t have a logo, I’m sending my gravatar), the title of the single page, and the global site title.
Why that? Well, mostly because this way, if you do post a link to my blog on Facebook, it will appear with the title of the post itself instead of the one that is visible on the page. This solves the problem of whether the title of the blog itself should be dropped from the <title> tag.
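As a rough illustration of the three properties mentioned above, here is a minimal Python sketch that renders them as OpenGraph <meta> tags. This is not the Typo code (Typo is a Rails application); the function name, the sample titles and the image URL are all made up for the example.

```python
from html import escape


def opengraph_tags(post_title, site_title, image_url):
    """Render the post title, site title and image as OpenGraph <meta> tags.

    Hypothetical helper for illustration only; not taken from Typo.
    """
    properties = [
        ("og:title", post_title),      # title of the single page
        ("og:site_name", site_title),  # global site title
        ("og:image", image_url),       # image for the blog (e.g. a gravatar)
    ]
    return "\n".join(
        '<meta property="{}" content="{}" />'.format(name, escape(value, quote=True))
        for name, value in properties
    )


if __name__ == "__main__":
    # Sample values are placeholders, not the real blog's data.
    print(opengraph_tags("Who consumes semantic data?",
                         "Example Blog",
                         "https://example.com/avatar.png"))
```

The point of keeping og:title separate from og:site_name is exactly the one above: a consumer can show the post title on its own without anyone having to decide what goes into <title>.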
As far as Google is concerned, instead, the most important piece of metadata you can provide seems to be authorship tagging, which uses Google+ to connect content by the same author. Is this going to be useful? Not sure yet, but at least the content shows up in a less anonymous way in the search results, and that can’t be bad. Contrary to what they say at that link, it’s possible to use an invisible <link> tag to connect the two, which is why you don’t find a G+ logo anywhere on my blog.
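For the curious, the invisible tag is nothing more than a <link> element with rel="author" in the page head, pointing at the profile. A tiny Python sketch of producing one follows; the helper name and the profile URL are placeholders I made up, not my actual setup.

```python
from html import escape


def author_link_tag(profile_url):
    """Build an invisible authorship link for the page <head>.

    Hypothetical helper; profile_url is whatever profile page identifies
    the author.
    """
    return '<link rel="author" href="{}" />'.format(escape(profile_url, quote=True))


if __name__ == "__main__":
    # Placeholder profile URL, for illustration only.
    print(author_link_tag("https://plus.google.com/0000000000"))
```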
What else do search engines do with the remaining semantic data? Not sure; they don’t seem to explain it, and since I don’t know what happens behind the scenes it’s hard for me to give a proper answer. But I can guess, and hope, that they use it to reduce the redundancy of the current index. For instance, pages that are actually lists of posts, such as the main index, the categories/tags pages and the archives, will now properly declare that they are describing a blog posting whose URL is, well, somewhere else. My hope would be for the search engines to then link to the declared blog post’s URL instead of the index page, and possibly boost the results for the posts that prove more popular (given they can then count the comments). What I’m surely counting on is for descriptions in search results to be more human-centered.
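To make the guess a bit more concrete, this is roughly what a consumer could do with an index page, assuming the posts are marked up as schema.org BlogPosting items in microdata with an itemprop="url" on the permalink. It is a naive sketch in Python using only the standard library; the class and function names are mine, and I have no idea whether any search engine works this way.

```python
from html.parser import HTMLParser


class BlogPostingURLs(HTMLParser):
    """Collect the first itemprop="url" found after each BlogPosting itemscope.

    Naive on purpose: it does not track nesting, so a nested item that
    declares its own url before the post's permalink would confuse it.
    """

    def __init__(self):
        super().__init__()
        self.urls = []
        self._awaiting_url = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            # A new item starts; only BlogPosting items are of interest here.
            itemtype = attrs.get("itemtype") or ""
            self._awaiting_url = itemtype.endswith("schema.org/BlogPosting")
        if self._awaiting_url and attrs.get("itemprop") == "url" and attrs.get("href"):
            self.urls.append(attrs["href"])
            self._awaiting_url = False


def declared_post_urls(html):
    """Return the permalinks an index page declares for its blog postings."""
    parser = BlogPostingURLs()
    parser.feed(html)
    parser.close()
    return parser.urls
```

Fed the HTML of the main index, declared_post_urls would return the permalinks of the listed posts, which is what I would like to see in the search results instead of the index page itself.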
Now, in the case of Google, you can use their Rich Snippet testing tool, which gives you an idea of what it finds. I’m pretty sure that they take all this data with a grain of salt though, seeing how many players there are in the “SEO” world, with people trying to game the system altogether. But at least I can hope that things will move in the right direction.
Interestingly, when I first implemented the new semantic data, Readability did not support it, and would show my blog’s title instead of the post’s title when reading the articles from there. After some feedback on their site they added a workaround for my case, so you can enjoy their app with my content just fine. Hopefully, with time, the microformat will be supported in the general sense.
On the other hand, Flattr still shows no improvement in its use of metadata, as far as I can see. They require that you actually add a button manually, including repeating the kind of metadata (content type, language, tags) that is already easily inferred from the microformat given. Hereby, I’d like to reiterate my plea to the Flattr developers to listen to OpenGraph and other microformat data, and at least use that to augment the manually-inserted buttons. Supporting the schema.org format, by the way, should make it relatively easy to add per-fragment buttons; for instance, I wouldn’t mind having a per-comment Flattr button to reward constructive comments, like they have on their own blog, but without the overhead of doing so manually.
Right now this is all the use of semantic data that I’ve been able to find. Hopefully things will become more useful in the future.