How blogging changed in the past ten years

One of the problems that keeps poking back at me every time I look for alternative software for this blog is that it has somehow become not your average blog, particularly not in 2017.

The first issue is that there is a lot of history. While the current “incarnation” of the blog, with the Hugo install, is fairly recent, I have been porting over a long history of my pseudo-writing, merging back into this one big collection the blog posts coming from my original Gentoo Developer blog, as well as the few posts I wrote on the KDE Developers blog and a very minimal amount of content from my (mostly Italian) blog when I was in high school.

Why did I do it that way? Well, the main thing is that I don’t want to lose the memories. As some of you might know already, I have faced my mortality before, and I came to realize that this blog is probably the only thing of substance I had a hand in that will outlive me. So I don’t want to let migrations, service shutdowns, and other similar changes take away what I did. This is also why I republished on this blog the articles I wrote for other websites, namely NewsForge and Linux.com (back when they were part of Geeknet).

Some of the recovery work actually required effort. As I said above, there’s a minimal amount of content that comes from my high school days blog, and it’s in Italian, which does not make it particularly interesting or useful. I had deleted that blog altogether years and years ago, so I had to use the Wayback Machine to recover at least some of the posts. I will be going through all my old backups in the hope of finding that one last backup I remember making before tearing the thing down.

Why did I tear it down in the first place? It was clearly a teenager’s blog, and I am seriously embarrassed by the way I thought and wrote. It was 13 or 14 years ago, and I admitted last year just how many times I’ve been wrong. But this is not the change I want to talk about.

The change I want to talk about is the second issue with finding good software to run my blog: blogging is not what it used to be ten years ago. Or fifteen years ago. It’s not just that a lot of money got involved in the meantime, so that there is now a significant number of “corporate blogs”, which end up being either product announcements in a different form, or another outlet for not-quite-magazine content. I know of at least a couple of Italian newspapers that provide “blogs” for their writers, which look almost exactly like the paper’s website, but do not have to be reviewed by the editorial board.

In addition to this, a lot of people’s blogs stopped providing as many details of their personal life as they used to. Likely this is related to the fact that we now know just how nasty people on the Internet can be (read: just as nasty as people off the Internet), and a lot of the people who used to write lightheartedly don’t feel as safe, rightly so. But there is probably another reason: “Social Media”.

The advent of Twitter and Facebook means there is less need to post short personal entries, too. And Facebook in particular appears to have swallowed most of the “cutesy memes” such as quizzes and lists of things people have or have not done. I know there are still a few people who insist on not using these big-name social networks, and still post for their friends and family on blogs, but I have a feeling they are quite the minority. And I can tell you for sure that since I signed up for Facebook, a lot of my smaller “so here’s that” posts went away.

Distribution chart of blog post sizes over time

This is a bit of a rough plot of blog post sizes. In particular, I have used the raw file size in bytes of the Markdown sources used by Hugo, which makes it imprecise for Unicode symbols, and it includes the “front matter”, which means that the non-Hugo-native posts in particular have their title effectively doubled by the slug. But it shows the trends well enough.

You can see from that graph that some time around 2009 I almost entirely stopped writing short blog posts. That is around the time Facebook took off in Italy, and a lot of my interaction with friends started happening there. If you’re curious about the visible lack of posts around mid-2007, that was the pancreatitis that had me disappear for nearly two months.

With this reduction in the scope of what people actually write on blogs, I also have a feeling that lots of people were left without anything to say. A number of blogs I still follow (via NewsBlur, since Google Reader was shut down) post once or twice a year. Planets are still a thing, and I still subscribe to a number of them, but I realize I don’t even recognize half the names nowadays. Lots of the “old guard” stopped blogging almost entirely, possibly because of a lack of engagement, or simply because, like me, many found a full time job (or a full time family) that takes most of their time.

You can definitely see from the plot that even my own blogging has significantly slowed down over the past few years. Part of it was the tooling giving up on me a few times, but it also comes down to the lack of energy to write all the time as I used to. Plus there is another problem: I now feel I need to be more accurate in what I’m saying and in the words I’m using. This is partly because I grew up, and know how much words can hurt people even when meant the right way, but also because it turns out that when you put yourself in certain positions it’s too easy to attack you (been there, done that).

A number of people argue that it was the demise of Google Reader¹ that caused blogs to die, but as I said above, I think it’s just the evolution of the concept, veering towards other systems that turned out to be more easily reachable by users.

So are blogs dead? I don’t think so. But they are getting harder to discover, because people use other platforms and it gets difficult to follow all of them. Hacker News and Reddit are becoming many geeks’ default way to discover content, and that has the unfortunate side effect that less of the conversation happens in a shared medium. I am indeed bothered by those people who prefer discussing the merits of my posts on those external websites to actually engaging in the comments, if nothing else because I do not track those platforms, and so the feeling I get is of people talking behind my back — I would prefer it if people actually told me when they share my posts on those platforms; for Reddit I can at least use IFTTT to self-stalk the blog, but that’s a different problem.

Will we still have blogs in 10 years? Probably yes, but most likely they will not look like the ones we’re used to. The same way as nowadays there still are personal homepages, but they clearly don’t look like Geocities, and there are social media pages that do not look like MySpace.


  1. Usual disclaimer: I work for Google at the time of writing this, but these are all personal opinions with no involvement from the company. For reference, I signed the contract before the Google Reader shutdown announcement, but started after it. I was sad too, but I found NewsBlur a better replacement anyway.

Tiny Tiny RSS: don’t support Nazi sympathisers


XKCD #1357 — Free Speech

After complaining about the lack of cache hits from feed readers, figuring out what was going on with NewsBlur (which was doing the right thing), and then fixing the problem, I started looking at which other readers remained broken. It turned out that about a dozen people used to read my blog using Tiny Tiny RSS, a PHP-based personal feed reader for the web. I say “used to” because, as of 2017-08-17, TT-RSS is banned from accessing anything on my blog by a ModSecurity rule.
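For the curious, the ban itself is a one-line affair. Here is a minimal sketch of the kind of rule involved, assuming TT-RSS still identifies itself with “Tiny Tiny RSS” in its User-Agent (the rule ID and message are arbitrary, not my actual configuration):

```apache
# Deny any request whose User-Agent identifies Tiny Tiny RSS.
# The rule ID is arbitrary; pick one outside the ranges of your other rules.
SecRule REQUEST_HEADERS:User-Agent "@contains Tiny Tiny RSS" \
    "id:1000001,phase:1,deny,status:403,msg:'Tiny Tiny RSS is banned'"
```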

The reason why I went to this extent is not merely technical, which is why the title of this post reads the way it does. It all started with me filing requests to support modern HTTP features for feeds, particularly regarding the semantics of permanent redirects, but also about the lack of If-Modified-Since support, which allows a significant reduction in the bandwidth usage of a blog¹. Now, the first response I got to the permanent redirect request was disappointing, but it was a technical answer, so I provided more information. After that?

After that, the responses stopped being focused on the technical issues, and became instead – and that’s not terribly surprising in FLOSS of course – “not my problem”. Except the answers also came from someone with a Pepe the Frog avatar². And this is August of 2017, when America has shown it has a real Nazi problem, and willingly associating yourself with the alt-right effectively makes you a Nazi sympathizer. The tone of the further answers also shows that it is no mistake or misunderstanding.

You can read the two bugs for yourself. Trigger warning: extreme right and ableist views ahead.

While I try not to spend too much time on political activism on my blog, there is a difference between debating whether universal basic income (or even universal health care) is a right or not, and arguing for ethnic cleansing and the death of part of a population. So no, there is no way I’ll refrain from commenting or throwing a light on this kind of toxic behaviour from developers in the Free Software community. Particularly when they are not even keeping these beliefs to themselves, but effectively flaunting them by using a loaded avatar on their official support forum.

So what can you do about this? If you got to read this post and have subscribed to my blog through TT-RSS, you now know why you don’t get any updates from it. I would suggest you look for a new feed reader. I will as usual suggest NewsBlur, since its implementation is the best one out there, and you can set it up by yourself, since it’s open source. Not only will you be cutting your support to Nazi sympathisers, but you will also save bandwidth for the web as a whole, by using a reader that actually implements the protocol correctly.

Update (2017-08-06): as pointed out in the comments by candrewswpi, FreshRSS is another option if you don’t want to set up NewsBlur (which admittedly may be a bit heavy). It uses PHP, so it should be easier to migrate to given the same or a similar stack. It supports proper caching at least, but I’m not sure about permanent redirects; that needs testing.

You could of course, as the developers said on those bugs, change the User-Agent string that TT-RSS reports, and keep using it to read my blog. But in that case, you’d be supporting Nazi sympathisers. If you don’t mind doing that, may I ask you a favour: stop reading my blog altogether. And maybe reconsider your life choices.

I’ll repeat here that the reason why I’m going to this extent is that there is a huge difference between the political opinions and debates that we can all have, and supporting Nazis. You don’t have to agree with my political point of view to read my blog; you don’t have to agree with me to talk with me or be my friend. But if you are a Nazi sympathiser, you can get lost.


  1. You could try to argue that in this day and age there is no point in worrying about bandwidth, but then you don’t get to ever complain about the existence of CDNs, or about the fact that AMP and similar tools are “undemocratizing” the web.
  2. Update (2017-08-03): as many people have asked: no, it’s not just any frog or any Pepe that automatically makes you a Nazi sympathiser. But the avatar was not one of the original illustrations, and the attitude of the commenter made it very clear what their “alignment” was. I mean, if they were fans of the original character, they would probably have the funeral scene as their avatar instead.

Modern feed readers

Four years ago, I wrote a musing about the negative effects of the Google Reader shutdown on content publishers. Today I can definitely confirm that some of the problems I foretold have materialized. Indeed, thanks to the fortuitous fact that people have started posting my blog articles to Reddit and Hacker News (neither of which I’m fond of, but let’s leave that aside), I can state that the vast majority of the bandwidth used by my blog is consumed by bots, and in particular by feed readers. But let’s start from the beginning.

This blog is syndicated over a feed, the URL and format of which have changed a number of times before, mostly together with the software, or with updates to the software. The most recent change was due to switching from Typo to Hugo, which changed the feed name. I could have kept the original feed name, but it made little sense at the time, so instead I set up permanent redirects from the old URLs to the new ones, as I always do. I say I always do because even the original URLs, from when I ran the blog off my home DSL, still work.
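In Apache terms, the redirects themselves are nothing fancy; here is a sketch with illustrative paths (not the blog’s actual URL history):

```apache
# mod_alias permanent redirects from historical feed URLs to the current one.
# The old paths here are made up for illustration.
Redirect permanent /articles.rss /index.xml
Redirect permanent /xml/rss20/feed.xml /index.xml
```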

Some services and feed reading software know how to deal with permanent redirects correctly, and will (eventually) replace the old feed URL with the new one. For instance, NewsBlur will replace a URL after ten fetches replied to with a permanent redirect (which is sensible: it avoids accepting a redirection that was set up by mistake and soon rolled back, and it avoids data-poisoning attacks). Unfortunately, this behaviour seems to be extremely rare, and so on August 14th I received over three thousand requests for the old Typo feed URL (admittedly, that was the most persistent URL I used). In addition to that, I also received over 300 requests for the very old Typo /xml/ feeds, of which 122 still pointed at my old dynamic domain, which has been pointing at the normal domain of my blog for almost ten years now. And yet some people still have subscriptions to those URLs! At least one Liferea and one Akregator are still pointing at them.

But while NewsBlur implements sane semantics for handling permanent redirects, it is far from a perfect implementation. In particular, even though I have brought this up many times, NewsBlur is not actually sending If-Modified-Since or If-None-Match headers, which means it takes a full copy of the feed at every request. It does support compressed responses (no fetch of the feed is allowed without compression), but NewsBlur requests the same URL more than twice an hour, because it seems to have two “sites” described by the same URL. At 50KiB per request, that makes up about 1% of the total bandwidth usage of the blog. To be fair, this is not bad at all, but one has to wonder why they can’t save the Last-Modified or ETag values — I guess I could install my own instance of NewsBlur and figure out how to do that myself, but who knows when I would find the time for that.
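For reference, the client side of a conditional request is not hard to get right. Here is a minimal sketch in JavaScript (Node 18+ with the built-in fetch; the feed URL is hypothetical): remember the validators from the previous response, and send them back on the next fetch.

```js
// Conditional GET: replay the validators from the last response so the
// server can answer 304 Not Modified instead of shipping the whole feed.
const feedUrl = 'https://blog.example.net/index.xml'; // hypothetical URL
let etag = null;
let lastModified = null;
let cachedBody = null;

async function fetchFeed() {
  const headers = {};
  if (etag) headers['If-None-Match'] = etag;
  if (lastModified) headers['If-Modified-Since'] = lastModified;

  const response = await fetch(feedUrl, { headers });
  if (response.status === 304) {
    return cachedBody; // nothing changed, no body was transferred
  }
  etag = response.headers.get('ETag');
  lastModified = response.headers.get('Last-Modified');
  cachedBody = await response.text();
  return cachedBody;
}
```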

Update (2017-08-16): it turns out that, as Samuel pointed out in the comments and on Twitter, I wrote something untrue. NewsBlur does send the headers, and supports this correctly. The problem is an Apache bug that causes a 304 to never be issued when If-None-Match is used together with mod_deflate.
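For anyone hitting the same bug: mod_deflate appends a -gzip suffix to the ETag it serves, but the conditional check compares the echoed If-None-Match against the unsuffixed value, so it never matches. A commonly suggested mod_headers workaround rewrites the incoming header to offer both variants; treat this as a sketch to test against your own Apache version, not gospel:

```apache
# Offer both the "-gzip"-suffixed and the plain ETag in If-None-Match,
# so the conditional check can match and a 304 can actually be issued.
RequestHeader edit "If-None-Match" '^"((.*)-gzip)"$' '"$1", "$2"'
```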

To be fair, even rawdog, which I use for Planet Multimedia, does not appear to support these properly. Oh, and speaking of Planet Multimedia: if someone would be interested in providing a more modern template, so that Monty’s pictures don’t take over the page, that would be awesome!

There actually are a few other readers that do support these values correctly, and indeed receive a 304 (Not Modified) status code most of the time. These include Lighting (somebody appears to still be using it!) and at least one more reader-as-a-service, Willreadit.com — the latter appears to be in beta and invite-only; it’s probably the best HTTP implementation I’ve seen for a service with such a rough website. Indeed, its bot’s landing page points out how it supports If-Modified-Since and gzip-compressed responses. Alas, it does not appear to learn from permanent redirects, so it’s currently fetching my blog’s feed twice, probably because it has at least two subscribers.

Also note that supporting If-Modified-Since is a prerequisite for supporting delta feeds, which are an interesting way to save even more bandwidth (although I don’t think this is feasible to do with a static website at all).

At the very least, it looks like we won the battle for supporting compressed responses. The only 406 (Not Acceptable) responses for the feed URL are for Fever, which is no longer developed or supported. Even Gwene, which I pointed out was hammering my blog the last time I wrote about this, is now content to get the compressed version. Unfortunately, it does not appear that my pull request was ever merged, which likely means the repository itself is completely out of sync with what is actually being run.

So in 2017, what is the current state of the art in feed reader support? NewsBlur has recently added support for JSON Feed, which is not particularly exciting – when I read the post I was reminded, by the screenshot of choice there, where I had heard of Brent Simmons before: Vesper, which is an interesting connection, but I should not go into that now – but at least it shows that Samuel Clay is actually paying attention to the development of the format — even though that development right now appears to be mostly about avoiding XML. Which, to be honest, is not that bad an idea: since HTML (even HTML5) does not have to be well-formed XML, you need to provide it as CDATA in an XML feed, and the way you do that makes it very easy to implement incorrectly.
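To make the pitfall concrete, this is roughly what embedding HTML in an RSS item looks like (an illustrative snippet, not from any real feed):

```xml
<item>
  <title>An example post</title>
  <!-- HTML is not well-formed XML, so it has to be wrapped in CDATA (or
       entity-escaped). A literal "]]>" inside the content would terminate
       the CDATA section early and corrupt the feed, which is exactly the
       kind of subtlety that gets implemented incorrectly. -->
  <description><![CDATA[<p>Some <b>HTML</b> content.</p>]]></description>
</item>
```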

Also, as I wrote this post I realized what else I would like from NewsBlur: the ability to subscribe to an OPML feed as a folder. I still subscribe to lots of Planets, even though they seem to have lost their charm, but a few people are aggregated in multiple planets, and it would make sense to be able to avoid duplicate posts. If I could tell NewsBlur «I want to subscribe to this Planet, aggregated into a folder», it would be able to tell which feeds are duplicated, and mark posts as read on all of them at the same time. Note that what I’d like is something different from just importing an OPML description of the planet! I would like the folder to be kept in sync with the OPML feed, so that if new feeds are added, they also get added to the folder, and the same for removed feeds. I should probably file that on GetSatisfaction at some point.

IPv6: WordPress has a long way to go, too

I recently complained about Hugo and the fact that its development seems to have been taken over by SEO types, who changed its defaults to something I’m not comfortable with. In the comments to that post I let it be understood that I’ve been looking into WordPress as an alternative once again.

The reason why I’m looking into WordPress is that I expect it to be a much easier setup, and (assuming I don’t go crazy with the plugins) an easily upgradeable install. Jürgen told me that they now support Markdown, and of course moving to WordPress means I don’t need to keep using Disqus for comments, and I can own my comments again.

The main problem with WordPress, like most PHP apps, is that it requires particular care to be set up securely and safely. Luckily, I do have some experience with this kind of work, and I thought I might as well share my experience and my setup once I got it running. But here is where things got complicated, to the point that I’m not sure I have any chance of getting this working, so I may have to stick with Hugo for much longer than I was hoping. And almost all of the problems fall back to the fact that my battery of test servers is IPv6-only. But don’t let me get ahead of myself.

After installing, configuring, and getting MySQL, Apache, and PHP-FPM to all work together nicely (which, believe me, was not obvious), I tried to set up the Akismet plugin, which failed. I ignored that, removed it, and then figured out that there is no setting to enable Markdown at all. It turns out it requires a plugin which, according again to Jürgen, is the Jetpack plugin from WordPress.com itself.

Unfortunately, I couldn’t get the Plugins page to work at all: it would just return an error connecting to WordPress.org. A quick tcpdump later told me that WordPress was trying to connect to api.wordpress.org, which, despite having eight separate IP addresses to respond from, has no IPv6. Well, that’s okay: I have a TinyProxy running on the host system, which I use to fetch distfiles from the non-v6-compatible “outside world”, so I just needed to set that up, right? After all, I was already planning on disallowing direct network access to WordPress, so that’s not a big deal.

Well, the first problem is that the way to set up proxies in WordPress is not documented in the default wp-config.php file. Luckily, I found that someone else wrote it down, and that started me in the right direction. Except it was not enough: the list of plugins and the search page would now come up, but plugins wouldn’t download, with the same error about not being able to establish a (secure) connection to WordPress.org — at first only in the Apache error log; the page itself will show a debug trace if you ask WP to enable debug reporting.
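For the record, these are the constants in question. They do exist in WordPress core, but the host and port here reflect my kind of setup rather than anything universal:

```php
<?php
// wp-config.php: route WordPress's outbound HTTP requests through a proxy.
// Host and port are illustrative; point them at your own TinyProxy instance.
define( 'WP_PROXY_HOST', '192.168.0.1' );
define( 'WP_PROXY_PORT', '8888' );
// Hosts to reach directly, without going through the proxy.
define( 'WP_PROXY_BYPASS_HOSTS', 'localhost' );
```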

Quite a bit of debugging later, with tcpdump and editing of the source code, I found the problem: some of the requests sent by WordPress target HTTP endpoints, and others (including the downloads, correctly) target HTTPS endpoints. The HTTP endpoints worked fine, but the HTTPS ones failed. And the reason they failed is that WordPress tried to connect to TinyProxy with TLS. TinyProxy does not support TLS, because it really performs only the minimal amount of work needed of a proxy. And for what it’s worth, in my setup it only allows local connections, so there is no real value in adding TLS to it.

It turns out this bug is only present when PHP does not have curl support and WordPress falls back to fsockopen. Enabling the curl USE flag for the ebuild was enough to fix the problem, and I reported the bug. I honestly wonder if the Gentoo ebuild should actually force curl on for WordPress, but I don’t want to go there yet.
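For fellow Gentoo users, that amounts to a one-line package.use entry, followed by re-emerging dev-lang/php (sketch; the file name under package.use is arbitrary):

```
# /etc/portage/package.use/php: give WordPress a curl-enabled PHP
dev-lang/php curl
```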

By the way, I originally didn’t want to mention this in this blog post, but since it effectively went viral: I also found out at that point that the reason I could get a list of plugins at all is that when the HTTPS connection to api.wordpress.org fails, the code explicitly retries over HTTP. It’s effectively a silent connection downgrade (you’d still find the warning in the log, but nothing would appear to break at first). This appears to include the “new version check” of WordPress, which makes it an interesting security issue. I reported it via WordPress’s HackerOne page before my tweet went viral — and sorry, I didn’t at first realize just how bad that downgrade was.

So now I have an installation of WordPress, mostly secure, able to look for, fetch, and install plugins. Let me install Jetpack to get that famous Markdown support that is a requirement and dealbreaker for me. For some reason (read: because WordPress is more Open Core than Open Source), it requires activating with a WordPress.com account. That should be easy, yes?

Error Details: The Jetpack server was unable to communicate with your site https:// [OMISSIS] [IXR -32300: transport error: http_request_failed cURL error 7: ]

I hid the URL of my test server, simply to avoid spammers. The website is public, it has a valid certificate (thank you, Let’s Encrypt), and it is not firewalled, nor does it require any particular IP to connect. But it is IPv6-only. Which makes things easy for me, as it reduces the number of scanners and spammers while I try it out, and since I have an IPv6-enabled connection at both home and the office, it makes it easy to test with.

Unfortunately, it seems like the WordPress infrastructure not only is not reachable from the IPv6 Internet, but does not even egress onto it. Which once again shows how infeasible IPv6-only networks still are. Contacting WordPress.com on Twitter ended up with them opening a support ticket for me, and a few exchanges and logs later, they confirmed their infrastructure does not support IPv6 and, as they said, «[they] don’t have an estimate on when [they] may».

Where does this leave me? Well, right now I can’t activate the normal Jetpack plugin, but they have a “development version”, which they assert is fully functional for Markdown, and that should let me keep evaluating WordPress without this particular hurdle. Of course this requires more time, and may end with me hitting other roadblocks; I’m not sure. So we’ll see.

Whether I get it to work or not, I will share my configuration files in the near future, because it took me a while to get them set up properly, and some of them are not really explained anywhere. You may end up seeing a new restyle of the blog in the next few months. It’s going to bother me a bit, because I usually prefer keeping the blog the same way for years, but I guess that needs to happen this time around. Also, changing the posts’ paths again means I’ll have to set up another chain of redirects. If I do that, I have a bit of a plan to change the hostname of the blog too.

Why I do not like Hugo

Not even a year ago, I decided to start using Hugo as the engine for this blog. It has mostly served me well, except for the fact that it relies on me having some kind of access to a console, a text editor, and my Bitbucket account, which made posting while travelling a bit harder. So I opted instead for writing drafts and then staggering their publication — which is why you now see that, for the most part, I post something once every three days, except for the free ideas.

Hugo was sold to me as a static generator for blogs, and indeed when I looked into it, that’s what it was clearly aiming to be. Sure, the support for arbitrary taxonomies makes it possible to use it in slightly different setups, but at that point it was seriously focusing on blogs and a few other similar site types. The integration with Disqus was pretty good from the start, as much as I’m not really happy with that choice, and the conversion proceeded mostly smoothly, although it took me weeks to make sure the articles were converted correctly, and even months in I dedicated a few nights a month just to going through the posts to make sure their formatting was right, or through the tags to collapse duplicates.

All in all, while imperfect, it was not as horrible as having to maintain my own Typo fork. Until last week.

I finally decided that maintaining a separate website for my projects is a bad idea. Not just because the styles of the two sites are out of sync, but most importantly because I barely ever update that content, as most of my projects are dead, have their own website already (like Autotools Mythbuster), or effectively just use their GitHub repository as their main description, even though that pains me. So the best option I found was to just build the pages I care about into Hugo, using a custom taxonomy for the projects, and be done with it. Except.

Except that to be able to do what I had in mind, I needed a feature that was committed after the version of Hugo I had frozen myself at, so I had to update. Updates with Typo were always extremely painful: new dependencies, new features, changes to the database layout, and all those kinds of problems. Certainly Hugo won’t have these problems! Except it decided it could no longer render the theme I was using, as one function got renamed from RSSlink to RSSLink.

That was an easy fix. A bit less easy, at first, was figuring out that someone had decided that RSS feeds should include, unconditionally, the summary of the article rather than the full text, because, and I quote: «This is a somewhat breaking change, but is what most people expect from their RSS feeds.»

I’m not sure who these “most people” are. And I’d say that if you want to change such a default, maybe you want to make it an option, but that does not seem to be Hugo’s style, as I’ll show later. But this is not why I’m angry. I’m angry because changing the RSS feed from full content to summary is a very clear change of intent.

An RSS feed that has the full article content is an RSS feed for a blog (or other site) that wants to be read. You can use such a feed to syndicate on Planets (yes, they still exist), read it on services like Feedly or NewsBlur (no, they did not all disappear with the death of Google Reader), and have it at hand in offline readers on your mobile devices, too.

RSS feeds that only carry summaries are there to drive traffic to a site. And this is where the nasty smell of SEOs and similar titles comes back in from below the door. I totally understand that someone trying to make a living off their website wants to be able to bring in traffic, which includes ad views and the like. I have spoken about ads before, and though I recently removed them from the blog altogether for lack of any useful profit, I totally empathise with those who actually can make a profit and want people to see their ads.

But the fact that the tools decide to switch to this mode makes me feel angry and sick, because they are no longer empowering people to make their views visible; they are empowering them to trick users into opening a website, to either be served ads or (if they are geeky enough to use Brave) give bitcoin to the author.

As it turns out, it’s not the only thing that happens to have changed with Hugo, and it all sounds like someone decided to follow the path of WordPress, which went from a blogging engine to a total solution for managing websites — kind of what Typo did when becoming Publify. Except that instead of aiming at a general website solution, they decided to one-up all of them. From the same release notes of the version that changed the RSS feed defaults:

Hugo 0.20 introduces the powerful and long sought after feature Custom Output Formats; Hugo isn’t just that “static HTML with an added RSS feed” anymore. Say hello to calendars, e-book formats, Google AMP, and JSON search indexes, to name a few ( #2828 ).

Why would you want to build e-book formats and calendars with the same tool you use to build a blog? Sure, if it were actually practical I could possibly make Autotools Mythbuster use this, but I somehow doubt it would have enough support for what I want to get out of the output, so I don’t even want to consider that for now. All in all, it looks like they are widening the target field a little too much.

Anyway, I went and reverted the changes in my local build of Hugo. I ended up giving up on that, by the way, and just applied a local template replacement instead, since that way I could also re-introduce another fix I needed for the RSS feed that was not merged upstream (the ability to put the taxonomy data into the feed, so you can use NewsBlur’s intelligence trainer to filter out some of my blog’s content). Of course, maintaining a forked copy of the built-in template also means that it can break when I update, if they decide it should be FeedLink next time around.
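The replacement amounts to dropping a copy of the built-in template into the site’s layouts/ directory and editing it. A sketch of the item element, based on my memory of the Hugo 0.20-era template, with the two local changes (full content, and the taxonomy data for NewsBlur):

```xml
<!-- layouts/rss.xml, excerpt of the <item> element (sketch) -->
<item>
  <title>{{ .Title }}</title>
  <link>{{ .Permalink }}</link>
  <pubDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" }}</pubDate>
  <guid>{{ .Permalink }}</guid>
  <!-- back to the full content, instead of the new .Summary default -->
  <description>{{ .Content | html }}</description>
  <!-- taxonomy data, so the intelligence trainer has something to chew on -->
  {{ range .Params.tags }}<category>{{ . }}</category>{{ end }}
</item>
```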

Then I pushed the new version, including the updated donations page – which is not redirected from the old one yet, I’m still working on that – and stopped looking at it too much. I did this (purposefully) in the 3-day break between two posts, so that if something broke I would have time to fix it, and everything looked alright.

Until I noticed that I had somehow flooded Planet Gentoo with a bunch of posts dating back to 2006! And someone pinged me on Hangouts for the same reason. So I rolled back to the old version (which did not solve the flooding, unfortunately), regenerated, and started digging into what had happened.

In the version of Hugo I used originally, the RSS feeds were fixed at 15 items. This is a perfectly reasonable default for a blog: I didn’t go anywhere near it even at the time I was spending more time blogging than sleeping. But since Hugo is no longer targeting blogs only, that’s not enough. “News” sites (and I use the quotes because too many of those are actually either aggregators of other things, outright scammers, or fake news sites) would publish many more than that per day, so 15 is clearly not a good option for them. So in Hugo 0.19 (the version before the one that changed to summaries), this change can be found:

Make RSS item limit configurable #3035

This is reasonable. The default is kept at 15, but now you can change it in the configuration file to whatever you want, be it 20, 50, or 100.
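In the site configuration, that is a single line (TOML sketch):

```toml
# config.toml: cap the number of items in the generated RSS feeds
rssLimit = 15
```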

What I did not notice at that point was this, from the following version:

Raise the default rssLimit #3145

That still sounds good, no? It raises the limit. To what?

hugolib: Default rssLimit to unlimited

Of course this is perfectly fine for small websites that have a hundred or two pages. But this blog counts over 2400 articles, written over the span of 12 years (as I have recovered a number of blog posts from previous platforms, and I’m still always looking to see if I can find the backups with the posts of my 15-year-old self). It ended up generating a 12MB RSS feed with every single page published up to then.

So what am I doing now? That, unfortunately, I’m not sure of. This is the kind of bad upgrade path that frustrated the heck out of me with Typo. Unfortunately, the only serious alternative I know is WordPress, and that still does not support Markdown unless you use a strange combination of plugins, and I don’t even want to get into that.

I am tempted to see what’s out there for Ruby-based blog engines, although at this point I’m ready to pay for a solution that runs natively on AWS EC2¹, to avoid having to manage it myself. I would like to be able to edit posts without needing a console and git client, and I would like to have an integrated view of the comments, instead of relying on Disqus², which at least a few people hate, and I don’t particularly enjoy.

For now, I guess I’ll have to be extra careful when I update Hugo. But at least I should be able to avoid breaking things this easily again, as I’ll be checking the output before and after the change.

Update (2017-08-14): it looks like this blog post got picked up by the Internet’s own peanut gallery, who not only don’t seem to understand why I’m complaining here (how many SEOs in there?), but also appear to have started suggesting, with more or less care, a number of other options. I say “more or less” because some are repeats, or aimed at solving different problems than mine. There are a few interesting ones I may spend some time looking into, either this week or the next, while on the road.

Since this post is over a month old by now, I have not been idle, and have instead started trying out WordPress, with not particularly stunning results either. I am still leaning towards that option though, because WordPress has the best upgrade guarantees (as long as you’re not using all kinds of plugins), and it solves the reliance on Disqus by having proper commenting support.

¹ Update (2017-08-14): I mean running the stack, rather than pushing to and storing on S3. If there’s a pre-canned EC2 instance I can install and run, that’ll be about enough for me.

² Don’t even try suggesting isso. Maybe my comments are not big data, but at 2400+ blog posts I don’t really want something single-threaded that accesses a SQLite file!

Let’s have a serious talk about Ads

I have already expressed my opinion on Internet ads months ago, so I would suggest you start reading from there, as I don’t want to repeat myself on this particular topic. What I want to talk about right now is whether ads actually work at all for things like my blog, or Autotools Mythbuster.

I’ll start by describing the “monetization” options that I use, then talk a bit about how much they make, look into the costs, and finally take a short tour of what else I’m still using.

Right now, there are two sources of ads that I use on this blog: Google AdSense and Amazon Native Ads. Autotools Mythbuster only has AdSense, because the Amazon ads don’t really fit it well. On mobile platforms, the only thing you really see is AdSense, as the Native Ads are all the way at the bottom (they don’t do page-level ads as far as I can tell); on desktop you only get to see the Amazon ads.

AdSense pays you both for clicks and for views of the ads on your site, although of course the former brings significantly higher revenue. Amazon Native Ads only pay you for the items people actually buy after clicking on the ads on your site, as they are part of the Amazon Affiliate program. I started using the Amazon Native Ads as an experiment over April and May, mostly out of curiosity about how they would perform.

The reason why I was curious about their performance is that AdSense, while it used to mostly pick ads based on the content of the page, has lately been mostly doing remarketing, which appears to creep some people out (I will make no further comments on this), while the idea that Amazon could show ads for content relevant to what I talked about appealed to me. It also turned out to be an interesting way to find a bug in Amazon’s acapbot, because of course crawlers are hard. As it turns out, the number of clicks coming from Amazon Native Ads is near zero, and the buying rate is even lower, but it still stacks up fairly well against AdSense.

To understand what I mean, I need to give you some numbers, which is something people don’t seem to be very comfortable with in general. Google AdSense, overall, brings in a gross of between €3 and €8 a month, with a very few rare cases in which it went all the way up to a staggering €12. Amazon Affiliates (which, as I’ll get to, does not only include the Native Ads) varies very widely month after month, even reaching $50. Do note that all of this is still pre-tax, so you have to just about cut it in half for an estimate (it’s actually closer to 35%, but that’s a longer story).

I would say that, between the two sources, over the past year I probably got around €200 before tax, so call it €120 net. I would have considered that not bad when I was self-employed, but nowadays I have different expectations, too. Getting the actual numbers of how much the domains cost me per year is a bit complicated, as some of them, including flameeyes.eu, are renewed in blocks of years, but I can give you makefile.am as a point of reference (and yes, that is an alias for Autotools Mythbuster) at €65.19 a year. The two servers (one storing configuration for easy re-deployment, the other actually being the server you read this blog from) cost me €7.36/month combined, and the server I use for actually building stuff costs me €49/month. This already exceeds the gross revenue of the two advertising platforms. Oops.

Of course there is another consideration to make. Leaving aside my personal preferences on lifestyle, and thus where I spend my budget for things like entertainment and food, there is one expense I’m okay with sharing, and that is my current standing donations. My employer not only matches donations, but also makes it very easy to set up a standing donation that is taken directly at payroll time. Thanks to this being so simple, I have a standing €90/month donation, spread between Diabetes Ireland, the EFF, the Internet Archive, and a couple of others that I rotate every few months. And then there are the Patreons I subscribe to.

This means that even if I were to put all the revenue from those ads into donations, it would barely make an impact. Which is why, by the time you read this post, my blog will have no ads left on it (Autotools Mythbuster will continue for a month or two, just so that the next payment is processed and not left in the system). They would be okay to leave there even if they make effectively no money, except that they still require paperwork to be filed for taxes, and this is why I had considered using the Amazon Native Ads.

As I said, the Amazon Native Ads are part of their Affiliate program, and while you can see in the reports how much revenue comes from ads versus links, the payments, and thus the tax paperwork, are merged with the rest of the affiliate program. And I have been using affiliate links in a number of places, not just on my blog, because in that case there is no drawback: Amazon is not tracking you any more or less than before, and it’s not getting in your way at all. The only place where I actually let Amazon count the impression (view) is when I’m reviewing a product (book, game, hardware device), and even that is fairly minimal, and not any different from me just providing the image and a link to it — except I don’t have to deal with the images and the link breakage connected with that.

There is another reason why I am keeping the affiliates: while they require people to actually spend money for me to get anything, they give you a percentage of the selling price of what was sold. Not just what you linked to specifically, but everything sold in the session that the user initiated by clicking on your link. This makes it interesting and ironic when people click on the link to buy Autotools Mythbuster and end up buying Autotools by Calcote instead.

Do I think this experience is universal or generally applicable? I doubt it. My blog does not get that many views anyway, and they went significantly down since I stopped blogging daily, and in particular since I no longer talk about Gentoo that much. I guess part of the problem is that, besides people looking for particular information finding me on Google, the vast majority of people end up on my blog either because they read it already, or because they follow me on various social media. I have an IFTTT recipe to post most of my entries to Twitter and LinkedIn (not Google+, because there is no way to do that automatically), and I used to have it auto-post the entries that would go to Planet Gentoo on /r/gentoo (much as I hate Reddit myself).

There is also the target audience problem: as most of the people reading this blog are geeks, it is very likely that they have an adblocker installed and do not want to see the ads. I think uBlock may even break the affiliate links while it’s at it. They do block things like Skymiles Shopping and similar affiliate aggregators, because of “privacy” (though there is not really any privacy problem there).

So at the end of the day, there is no gain for me in keeping the ads running, and there is some discomfort for my readers, thus I took them down. If I could, I would love to run ads for charities only, with no money going to me at all, just to remind people of the importance of donating, even a little bit, to organizations such as the Internet Archive, which saved my bacon multiple times as I fixed the links from this blog to other sites that in the meantime moved without redirects, or were just turned down. But that’s a topic for another day, I think.

New blog, powered by Hugo

If you’re seeing this post, it means I have flipped the switch and started using Hugo as the engine for this blog. You may remember I almost turned the blog off, and I did indeed make it read-only in June, as Typo was misbehaving and I could not be bothered to fix it.

When I decided a few months ago to make the blog read-only and disable comments, it actually meant replacing the whole app with the output of the caching layer in Rails. Not great, but it worked out for a while. In the meantime I have been thinking about what else I could do to make the blog easier to maintain. At the moment, the best option I can think of is this: a static blog engine, with Disqus for the commenting, which is not my favourite, but I could not find a better alternative.

Hugo works out fairly well, and I’ve settled on a modified version of the Strata theme. Some of the improvements I’ve made (and am making) I’ll send upstream; others might not make it so easily. I’m a bit surprised that Hugo does not, by default, try minifying or merging the JavaScript resources, so I might invest some time in that at some later point.

While I made sure that all the permalinks match what I had before, there are obvious things that changed place, such as the RSS feeds, and the monthly archives don’t exist anymore (did anybody ever use them?), plus I’m missing a CSE to search the blog. On the other hand, I’m also missing ads, so your mileage may vary on whether this is an improvement or not.

As for comments, I think this is important to state outright, because I know someone will start complaining about the fact I settled on a closed, non-free service such as Disqus. I don’t like it either, but it’s the best option I found in the short run. I do hope I’ll manage to find a better replacement at some point in the future, but the open-source alternatives I found appear to just chase the trend du jour.

Isso appears to be the most mature alternative, but it uses SQLite to store the comments, because «comments are not bigdata», completely ignoring the fact that SQLite is horrible for parallel access, which means the app is likely not going to keep up with a spam attack. The alternative from Hugo’s author relies instead on MongoDB and NodeJS, which is horrible for a different reason. I did see one written in PHP, but it writes XML files directly to disk, with the default instructions suggesting to chmod a directory to 777.

I particularly don’t like the fact that Disqus only allows getting a backup copy of the comments up to a limit that they don’t disclose, with their instructions essentially suggesting that you should try requesting a backup, and if nothing mails you back twice, then you’re too big. Not a lot of trust I can put in that.

So anyway, hopefully I’ll manage to write more interesting stuff soon; for now I hope I didn’t break too many links. There are also a few posts that likely have broken text, due to having to convert all the two thousand posts written in Textile to Markdown with Pandoc, which is not completely accurate. If you see anything wrong, please just leave a comment and I’ll pick it up from there.

About the background picture: it might be worth noting that the picture I’m using is mine; I took it from my Air France flight to Shanghai, in September. Yes, the guy who was so afraid of flying is now traveling around the globe for work. Ironic?

Finding a better blog workflow

I have been ranting about editors for the past few months, a year after considering shutting the blog down. After some more thinking and fighting, I now have a better plan, and the blog is not going away.

First of all, I decided to switch my editing to Draft, and started paying for a subscription at $3.99/month. It’s a simple-as-it-can-be editor, with no pretence. It provides the kind of “spaced out” editing that is so trendy nowadays, and a so-called “Hemingway mode” that does not allow you to delete. I don’t really care for the latter, but it’s not so bad.

More importantly, it gets the saving right: if the same content is being edited in two different browsers, one gets locked (so I can’t overwrite the content), and a big red message telling me that it can’t save appears the moment I try to edit something while the Internet connection is gone or I have been logged out. It has no fancy HTML editor, and is instead designed around Markdown, which is what I’m using to post on my blog as well nowadays. It supports C-i and C-b just fine.

As for the blog engine, I decided not to change it. Yet. But I also decided that upgrading it to Publify is not an option. Among other things, as I went digging trying to fix a few of the problems I’ve been having, I discovered just how much spaghetti code it was to begin with, and I lost any trust in the developers. Continuing to build upon Typo without taking the time to rewrite it from scratch is, in my opinion, time wasted. Upstream’s direction has been to build more and more features to support Heroku, CDNs, and so on and so forth — my target is to make it slimmer, so I started deleting good chunks of code.

The results have been positive: after some database cleanup, and after removing support for structures that were never implemented to begin with (like primary and hierarchical categories), browsing the blog should be much faster and less of a pain. Among the features I dropped altogether is theming, as the code is now very specific to my setup; that allowed me to use the Rails asset pipeline to compile the stylesheets and JavaScript, which should lead to faster load times for all (even though it also caused a global cache invalidation, sorry about that!)

My current plan is to not spend too much time on the blog engine in the next few weeks, as it has reached a point where it’s stable enough, but rather to fix a few things in the UI itself, such as the Amazon ads loading in a way that currently causes some elements to jump across the page a little too much. I also need to find a new, better way to deal with image lightboxes — I don’t have many in use, but right now they are implemented with a mixture of Typo magic and JavaScript; ideally I’d like for the JavaScript to take care of everything, attaching itself to data-fullsize-url attributes or something like that, as sketched below. I have not looked into replacements explicitly yet, so suggestions are welcome. Similarly, if anybody knows a good JavaScript syntax highlighter to replace coderay, I’m all ears.
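To make the idea concrete, here is a minimal sketch of what I mean; note that the data-fullsize-url attribute is my own invention for this example, not an existing library’s API:

```js
// Turn every image that declares a full-size URL into a click-to-zoom
// lightbox, with no server-side magic involved.
document.querySelectorAll('img[data-fullsize-url]').forEach(function (img) {
  img.addEventListener('click', function () {
    var overlay = document.createElement('div');
    overlay.style.cssText =
      'position:fixed;top:0;left:0;right:0;bottom:0;' +
      'background:rgba(0,0,0,0.8);display:flex;' +
      'align-items:center;justify-content:center;cursor:pointer';
    var full = document.createElement('img');
    full.src = img.dataset.fullsizeUrl; // reads data-fullsize-url
    full.style.maxWidth = '90%';
    full.style.maxHeight = '90%';
    overlay.appendChild(full);
    overlay.addEventListener('click', function () { overlay.remove(); });
    document.body.appendChild(overlay);
  });
});
```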

Ideally, I’ll be able to move to Rails 4 (and thus Passenger 4) pretty soon, although I’m not sure how well that works with PostgreSQL. Adding (manually) some indexes to the tables, and especially making sure that the diamond tables for tags and categories did not include NULL entries and had a proper primary key covering the full row, made quite the difference in the development environment (less so in production, as more data is cached there, but it should still be noticeable if you’re jumping around my old blog posts!)

Coincidentally, among the features I dropped from the codebase are the update checks and inbound links (which used the Google Blog Search service that does not exist any more), making the webapp network-free — Akismet stopped working some time ago, and that is one of the things I actually want to re-introduce, but then again I need to make sure that the connection can be filtered correctly.

By the way, for those who are curious why I spend so much time on this blog: I have been able to preserve all the content I could, from my first post on Planet Gentoo in April 2005, on b2evolution. Just a few months short of ten years now. I was also able to recover some posts from my previous KDE Developers blog from February of that year, and a few (older) posts in Italian that I originally sent to the Venice Free Software User Group in 2004. Which essentially means, for me, over ten years of memories and words. It is dear to me, and most of you won’t have any idea how much — it probably also says something about the priorities in my life, but who cares.

I’m only bothered that I can’t remember where I put the backup I made of what I was writing on Blogspot when I was in high school. Sure, it’s not exactly the most pleasant writing (and it was all in Italian), but I really would like for it to be part of this single base. Oh, and this is also the reason why you won’t see me write more on G+ or Facebook — those two and Twitter are essentially just rant platforms to me, but this blog is part of my life.

Cleanup chores

These last few days in San Francisco I was able to do at least some of the work I had set out to do in my spare time, mostly on my blog. The week turned out to be much fuller of things to do than I originally planned, so I did not get through all my targets, but at least a few were accomplished.

Namely, all my blog archives now consistently link to the HTTPS versions of the posts, as well as the HTTPS version of my website – which is slowly withering to leave more space to the blog – and of Autotools Mythbuster on its new home domain. This sounds like an easy task, but it turned out to be slightly more involved than I was expecting, among other things because at some point I used protocol-relative URLs. I even fixed all the links that pointed to the extremely old Planet Gentoo blog, so that the cross-references now work, even though probably nobody will read those posts ever again. I also made all the blog comments coming from me consistent, by using the same email address (rather than three different ones) and the same website. Finally, I got the list of 404s as seen by GoogleBot and made sure that the links that were broken when posted out there now point to the right posts.

But there were a few more things that needed some housekeeping, related to account churn. For context, this past Friday was my birthday — and I received an interesting email from a very old games forum that I registered on when I was helping out with the NoX-Wizard emulator: a “happy birthday” message. I then remembered that most vBulletin/phpBB installs send their greetings to registered users who opted in to provide their birthdate (and sometimes you were forced to, due to COPPA). Then, since there have been some rumors of a breach at an Italian provider which I used when I originally went online, I decided to go and change passwords – once again, thanks LastPass – and found there two more similar messages from other forums, which I probably have not visited in almost ten years.

You could think that there is no reason to go and try to restore those accounts to life — and I would almost agree with you, if it weren’t that they pose a security risk the moment they get breached. And it should be obvious by now that breaching lots of small sites can be just as profitable as breaching a single big site, and much easier. Those forums most likely still had my original, absolutely insecure passwords, so I went and regenerated them.

I wonder how many more accounts I have forgotten about are out there — I know for sure there are some that were attached to my BerliOS email address, which is now long gone. The other day, using Baidu to look for myself, I was reminded that I had a Last.FM account, which I now got access to again. At least when using a password manager it’s more difficult to forget about accounts altogether, as they are stored there.

Anyway, for the moment this is enough cleanup. Feel free to report if there are other things I should probably work on, non-Gentoo related (Autotools Mythbuster is due an update, but I have not had time to go through that yet); the slow-loading Amazon ad on the blog will also be fixed, promised!

My ideal editor

Notepad Art
Photo credit: Stephen Dann

Some of you have probably read me ranting on G+ and Twitter about blog post editors. I have been complaining about them since at least last year, when Typo decided to start eating my drafts. After that near-meltdown I decided to look for alternative ways of writing blog posts, first with Evernote – until they decided to reset everybody’s passwords, and required you to type some content from one of your notes to be able to get the new one – and then with Google Docs.

I had indeed kept using Google Docs until recently, when it started having some issues with dead keys. I have been using the US International layout for years, and I’m too used to it even when writing English. If I am to use a non-dead-keys keyboard, I end up adding spaces where they shouldn’t be. So even if it could be solved by just switching the layout, I wouldn’t want to write a long text that way.

Then I decided to give Evernote another try, especially as the Samsung Galaxy Note 10.1 I bought last year came with a yet-to-activate 12-month subscription to the Pro version. Not that I find anything extremely useful in it, but…

It all worked well for a while, until they decided to throw me into the new beta editor, which follows all the newest trends in blog editors. Yes, because there are trends in editors now! Away goes full-width editing; instead you have a limited-width editing space on a mostly-white canvas with a disappearing interface, like node.js’s Ghost, Medium, and now Publify (the new name of what used to be Typo).

And here’s my problem: while I understand that they are trying to make things that look neat and that supposedly help you “focus on writing”, they miss the point quite a bit with me. Indeed, rather than a fancy editor, I think Typo needs a better drafting mechanism, one that does not puke on itself when you start playing with dates and other similar details.

And Evernote’s new editor is not much better. Indeed, last week, while I was in Paris, I decided to take half an afternoon to write about libtool – mostly because J-B has been facing some issues and I wanted to document the root causes I encountered – and after two hours of heavy writing, I got back to Evernote, and the note was gone. Indeed, it asked me to log back in. And I had logged in that same morning.

When I complained about that on Twitter, the amount of snark and backward thinking I got surprised me. I was expecting some trolling, but I had people seriously suggesting that one should not edit things online. What? In 2014? You’ve got to be kidding me.

But just to make that clear: yes, I have used offline editing for a while in the past, as Typo’s editor has been overly sensitive to changes too many times. But it does not scale. I’m not always on the same device: not only do I have three computers in my own apartment, but I have two more at work, and then there are the tablets. It is not uncommon for me to start writing a post on one laptop, then switch to the other – for instance because I need access to the smartcard reader to read some data – or to start writing a blog post on my work laptop while at a conference, and then finish it in my room on the personal one, and so on and so forth.

Yes, I could use Dropbox for out-of-band synchronization, but its handling of conflicts is not great if you end up having one of the devices offline by mistake — better than its effects on password syncs, but not so much better. Indeed, I have had bad experiences with that, because it makes it too easy to start working on something completely offline, and then forget to resync it before editing it from a different device.

Other suggestions included (again) the use of statically generated blogs. I have said before that I don’t care for them, and I don’t want to hear them as suggestions. First, they suffer from the same problems stated above with working offline, and second, they don’t really support comments as first-class citizens: they require services such as Disqus, Google+ or Facebook to store the comments, including them in the page as an external iframe. Not only do I dislike the idea of farming out the comments to a different service in general, but I would be losing too many features: search within the blog, fine-grained control over commenting (all my blog posts are open to comments, but they are filtered through my ModSecurity rules), and I’m not even sure they would allow me to import the current set of comments.

I wonder why, instead of playing with all the CSS and JavaScript to make the interface disappear, the editors’ developers don’t invest time in making the drafts bulletproof. Client-side offline storage should allow data to be preserved even when you get logged out or lose network connectivity. I know it’s not easy (or I would be writing it myself), but it shouldn’t be impossible either. Right now it seems the bling is what everybody wants to work on, rather than functionality — it probably is easier to put in your portfolio, and that could be as good an explanation as any.
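To sketch what I mean by bulletproof drafts, here is the general shape of it in JavaScript; the element ID and storage key are made up for the example:

```js
// Mirror the draft into localStorage on every change, so that a dropped
// connection or a forced logout cannot eat hours of writing.
var editor = document.getElementById('post-body'); // hypothetical editor field
var storageKey = 'draft:post-body';                // made-up key

// On load, offer back whatever survived the previous session.
var saved = localStorage.getItem(storageKey);
if (saved !== null && editor.value === '') {
  editor.value = saved;
}

// On every edit, persist locally; the stored copy should be cleared only
// after the server has acknowledged a successful save.
editor.addEventListener('input', function () {
  localStorage.setItem(storageKey, editor.value);
});
```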