Redirecting URLs after Typo update

This is a note that might be useful to others, since I had to fight with this myself. Since I moved to the new version of Typo (the blog engine that powers this blog, itself built on Ruby on Rails), the server has been handling a huge number of redirects. The reason is that the application stopped using the /articles path prefix on all pages, and moved the feeds (Atom and RSS) around. While Typo does take care of those requests by itself, this is far from optimal: every such request hits the Ruby code just to redirect a URL, which seemed to keep the load average quite high. So I added this code to the list of redirections I already had (more on that later in this post):

url.redirect += (
    # Convert the old URL scheme to the new one at lighttpd level to avoid
    # hitting Rails' redirect controller (much slower than this)
    "^/xml/(atom?|rss)(10|20)?/(category|tag)/(.*)/feed\.xml.*" => "/$3/$4.$1",
    "^/xml/(atom?|rss)(10|20)?/feed\.xml.*" => "/articles.$1",
    "^/xml/(atom?|rss)(10|20)?/comments/feed\.xml.*" => "/comments.$1",
    "^/articles/(.*)$" => "/$1",
)

This will translate the URLs directly in lighttpd so that the call won’t hit FastCGI, Ruby, Rails and all the rest.
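To sanity-check rules like these before reloading the server, a small Python sketch can mimic the matching. This is only an approximation, not how lighttpd itself works: it assumes the first matching rule wins, that Python's `re` behaves like lighttpd's regex engine for these simple patterns, and it writes the `$N` placeholders as `\N` backreferences. The `redirect` helper and the sample paths are made up for illustration.

```python
import re

# The rules from the url.redirect list above, with the literal dots in
# "feed.xml" escaped and lighttpd's $N placeholders written as \N.
RULES = [
    (r"^/xml/(atom?|rss)(10|20)?/(category|tag)/(.*)/feed\.xml.*", r"/\3/\4.\1"),
    (r"^/xml/(atom?|rss)(10|20)?/feed\.xml.*", r"/articles.\1"),
    (r"^/xml/(atom?|rss)(10|20)?/comments/feed\.xml.*", r"/comments.\1"),
    (r"^/articles/(.*)$", r"/\1"),
]


def redirect(path):
    """Return the redirect target for path, or None if no rule matches."""
    for pattern, target in RULES:
        match = re.match(pattern, path)
        if match:
            # First matching rule wins (assumed to mirror lighttpd).
            return match.expand(target)
    return None


if __name__ == "__main__":
    # Hypothetical old-style paths, only for illustration.
    for old in (
        "/xml/rss20/feed.xml",
        "/xml/atom/category/gentoo/feed.xml",
        "/xml/atom10/comments/feed.xml",
        "/articles/2008/01/01/some-post",
    ):
        print(old, "=>", redirect(old))
```

Under those assumptions, `/xml/rss20/feed.xml` maps to `/articles.rss` and `/articles/2008/01/01/some-post` maps to `/2008/01/01/some-post`, while a path already in the new scheme matches no rule and is left alone.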

I would seriously start to think about moving to cherokee if it were well supported by webapp-config (for bugzilla); while I hate cherokee's configuration file with all its exclamation marks, lighttpd sometimes seems quite silly to me, like mod_access only being able to deny access by trailing strings. What if I want to deny access to a subtree (for instance, /trackback or /admin on a non-SSL version of the server)?

And keeping a map that translates broken URLs into working URLs is a mess, but I’m forced to keep one, since OSGalaxy not only still misspells my name but also truncates the URLs to my actual blog entries. I started noticing this through Google’s Webmaster Tools control panel, since it reported 404s on URLs that I knew were broken; it took me a while to find where these URLs came from. In the meantime I have this huge redirection table that I update every three days with the new scan results from Google…