Who consumes the semantic web?

In my previous post I noted that I was adding support for the latest fad in semantic tagging of data on web pages, but it was not at all clear who actually consumes that data. So let's see.

In the midst of the changes I've been sending to Typo to support a fully SSL-compatible blog install (mine is not entirely there yet, mostly because most of the internal links from one post to the next are not currently protocol-relative), I added one commit to provide a bit more OpenGraph metadata – OpenGraph is used almost exclusively by Facebook. The only metadata I provide through that protocol, though, is an image for the blog – since I don't have a logo, I'm sending my gravatar – the title of the single page, and the global site title.

Why that? Well, mostly because this way, if you post a link to my blog on Facebook, it will appear with the title of the post itself rather than the title visible on the page. This sidesteps the question of whether the blog's own title should be dropped from the <title> tag.
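For reference, this is roughly what those OpenGraph tags look like in the page header; the site name and image URL here are placeholders, not the actual values my blog emits:

```html
<head>
  <!-- OpenGraph metadata, read by Facebook when a link is shared.
       Values below are illustrative placeholders. -->
  <meta property="og:site_name" content="My Blog Title" />
  <meta property="og:title" content="Who consumes the semantic web?" />
  <meta property="og:image" content="https://example.org/my-gravatar.png" />
</head>
```

When Facebook scrapes the page, it prefers og:title over the contents of the <title> tag, which is what makes the shared link show the post's own title.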

As for Google, the most important piece of metadata you can provide seems to be authorship tagging, which uses Google+ to connect content by the same author. Is this going to be useful? Not sure yet, but at least it makes my posts show up in a less anonymous way in the search results, and that can't be bad. Despite what that page says, it's possible to use an invisible <link> tag to connect the two, which is why you won't find a G+ logo anywhere on my blog.
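The invisible variant is a single element in the page header; the profile URL below is a placeholder, not my actual Google+ profile:

```html
<head>
  <!-- Invisible authorship link: connects the page to a Google+
       profile without rendering any visible badge or logo. -->
  <link rel="author" href="https://plus.google.com/000000000000000000000" />
</head>
```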

What else do search engines do with the remaining semantic data? I'm not sure; the documentation doesn't seem to explain it, and since I don't know what happens behind the scenes it's hard for me to give a proper answer. But I can guess, and hope, that they use it to reduce the redundancy of the current index. For instance, pages that are actually lists of posts – the main index, the categories/tags, and the archives – now properly declare that they describe blog postings whose URLs are, well, somewhere else. My hope is that search engines will then link to the declared post URL instead of the index page, and possibly boost the results for posts that prove more popular (given they can then count the comments). What I am certainly counting on is for the descriptions in search results to become more human-readable.
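A sketch of what one entry on an index page can look like with schema.org microdata; the post URL and titles are made up for illustration:

```html
<!-- One post excerpt on an index/archive page. The itemprop="url"
     tells crawlers the posting's own permalink, which is different
     from the URL of the index page they are currently reading. -->
<article itemscope itemtype="http://schema.org/BlogPosting">
  <a itemprop="url" href="https://example.org/2011/05/a-post-title">
    <span itemprop="name">A post title</span>
  </a>
  <div itemprop="articleBody">The excerpt of the post goes here…</div>
</article>
```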

In the case of Google, you can use their Rich Snippet testing tool to get an idea of what they find. I'm pretty sure they take all this data with a grain of salt, though, given how many players there are in the "SEO" world, with people trying to game the system altogether. But at least I can hope that things will move in the right direction.

Interestingly, when I first implemented the new semantic data, Readability did not support it, and would show my blog's title instead of the post's title when reading articles from there. After I sent feedback through their site they added a workaround for my case, so you can enjoy their app with my content just fine. Hopefully, with time, the microformat will be supported in general.

On the other hand, Flattr has made no progress on using metadata, as far as I can see. They require you to add a button manually, repeating the kind of metadata (content type, language, tags) that is already easily inferred from the microformats provided. I'd like to reiterate my plea to the Flattr developers to listen to OpenGraph and other microformat data, and at least use it to augment the manually-inserted buttons. Supporting the schema.org format, by the way, should make it relatively easy to add per-fragment buttons – i.e., I wouldn't mind having a per-comment Flattr button to reward constructive comments, like they have on their own blog, but without the overhead of doing so manually.

Right now this is all the semantic data that I have figured out is actually being consumed. Hopefully it will become more useful in the future.

Thoughts on Flattr

First of all I wish to thank Sebastian, who gave me the invite to Flattr to begin with, a few months ago. This was not only helpful to me, but also gave me a bit of insight into the evolution of the service over these past months.

Now, I haven't really written much about it before, mostly because I didn't want to stress its presence – if you read my blog through syndication on Planet Gentoo or Gentoo Universe, the fact that posts have a Flattr button is hidden from you altogether. But since there has been a lot going on in the past month, I thought it might be the right moment to write down some notes about it.

You might or might not have heard that Debian developers got into bickering about what is and is not allowed on Planet Debian, mostly about Flattr, with a number of very different ideas about why it is bad for their developers to use Flattr at all – ranging from calling it a scam for the 10% cut it takes of the money flowing through, to calling the "social networking buttons" webbugs (including Feedburner), to simply saying that it's a commercial use of the Debian infrastructure. Whatever the outcome, I don't think Flattr stands to gain many fans there.

But what I found interesting in that discussion is the suggestion that the Flattr platform itself might not be faring too well. The reasoning is that after the initial inflow of money at registration, just like in the "real world economy", the consumers pay and the producers get the money. In my case, I haven't really had the need to add any funds since I registered, and while I feel bad about it, I don't think that will change anytime soon (mostly because I'm actually going through a rough patch and realistically cannot afford to; and even though I promised myself not to withdraw funds anytime soon, I ended up doing so to pay for the SSL certificate from StartSSL as well – so thanks to all of you who contributed!).

There are also a few issues with the technical interface of the service: when I decided to implement support for Flattr in my fsws (which I have open-sourced already, but haven't had time to document or in general write about), I ran into a few issues with the resulting widget not being properly usable within XHTML; those were thankfully fixed in the newest generation of the JavaScript code. On the other hand, when they decided to update their widgets, the change went live for all users without warning, dang!

The one thing the service seems to lack, though, is support for the same metadata API used by Facebook: OpenGraph. Right now you have to provide the URL, the description, the tags, the type, and so on and so forth every time you place a Flattr widget, which means that every index page of my blog repeats the same content many, many times. If it used OpenGraph instead, maybe slightly extended (even Facebook has a fb_admins value that is not documented there), the widget would only need the URL; the service could then fetch the rest of the details from the page itself.

Having such a setup would also make "Flattr anything" extensions feasible: right now they really cannot work, because you do not know who owns a given webpage, what kind of page it is, which tags to use, and so on. Nor can you declare a canonical URL, so the same page could be submitted twice if it is reachable through two different URLs (as most of this blog is) – and that is despite the fact that we already have two methods, not one, to declare canonical URLs.
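Both of those canonical-URL methods are one-liners in the page header; the URL below is a placeholder:

```html
<head>
  <!-- Method 1: the standard HTML canonical link -->
  <link rel="canonical" href="https://example.org/2011/05/a-post-title" />
  <!-- Method 2: the OpenGraph equivalent -->
  <meta property="og:url" content="https://example.org/2011/05/a-post-title" />
</head>
```

A "Flattr anything" extension that honored either one could deduplicate submissions no matter which alias of the page a reader happens to be on.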

Finally, I have to say that from at least one side Flattr has proven itself profitable to me: in the past two months, during which its usage surged, I received around €20/month through it. That is definitely not a lot (at least considering the expenses the Tinderbox has caused me in the same time), but it's not nothing either. Once again, thank you all.

Also, it has been fun to see some of my older posts being flattred from time to time; I guess some people actually end up finding them through Google, and I hope they found what they were looking for.