After spending a day working on the new website I’ve been able to identify some of the problems and design some solutions that should produce good enough results.
The first problem is that the original site not only used PHP and a database, but also misused them badly. The usual way to avoid duplicating the style with PHP is to have a generic skin template, plus one or more scripts per page that get included in the main one depending on parameters. This usually results in a mostly-working site that, while doing lots of work for nothing, still does not bog down the server with unneeded work.
In the case of xine’s site, each request loaded either a static HTML page that got included, or a piece of PHP code that defined variables which the main script would then substitute into a generic skin template. The menu was not written out once either, but generated on the fly for every page request, and almost all the internal links in the pages were generated by a function call. The left-side padding of the menu entries for sub-pages was obtained by creating (with PHP functions) a small table before the image and text that formed the menu link. On top of all this, the SourceForge logo cycled on a per-second basis, which meant that a user browsing the site would load about six different SourceForge images into the cache, and that no two requests would ever get the same page.
The download, release, snapshots and security pages loaded their data on the fly from a series of flat files containing some metadata, from which the output you’d have seen was then produced. And to add client-side waste to what was already a waste of time on the server side, the shading changes of the left-hand menu were done with JavaScript rather than the standard CSS2 :hover pseudo-class.
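For reference, that effect needs no script at all: a couple of CSS rules are enough. This is a minimal sketch with a made-up class name, not the real markup of either the old or the new site:

    /* hypothetical markup: <ul class="menu"><li><a href="...">...</a></li></ul> */
    .menu a {
        background-color: #333366;   /* normal shade of a menu entry */
        color: #ffffff;
        text-decoration: none;
    }

    .menu a:hover {
        background-color: #5555aa;   /* shade while the pointer is over the entry */
    }

The browser applies the second rule by itself while the pointer hovers over the link, so there is nothing to execute on the client at all.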
Probably because of the bad way the PHP code was written, the site had all crawlers blocked through robots.txt, which is a huge setback for a site aiming to be public. Indeed, you cannot find it in Google’s cache because of that, which meant that last night I had to work with the Wayback Machine to see how the site appeared earlier; and what I got was from one year ago, not what we had a few weeks ago. (This has since stopped being a problem, as Darren gave me a static snapshot of the thing as seen on his system.)
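For the record, a blanket block like that takes just two lines of robots.txt; this is the generic form, not a quote of the exact file the old site shipped:

    User-agent: *
    Disallow: /

Dropping that Disallow is all it takes for well-behaved crawlers to start indexing the site again.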
To solve these problems I decided on a few things for the new design. First of all, as I’ve already said, the site has to be entirely static once generated, so that the files served are exactly the same for every request. This includes removing the visit counters (who cares about those nowadays, really) and the changing SourceForge logo. It ensures that crawlers and users alike see the exact same content for as long as it isn’t changed, which keeps caches happy.
Also, all the pages will have to hide their extensions, which means I don’t have to care whether a page is .htm, .html or .xhtml. Just like on my own site, all extensions will be hidden, so even a switch to a different technology will not invalidate the links. Again, this benefits search engines and users alike.
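Hiding the extensions is a matter of server configuration rather than of the pages themselves. Assuming the site ends up on an Apache host (an assumption of mine, nothing decided here), content negotiation already does the job:

    # Sketch for Apache's mod_negotiation; the path is a placeholder.
    # With MultiViews, a request for /faq is answered with faq.html, faq.xhtml
    # or whatever variant of that name exists in the directory, so the
    # published links never need to carry an extension.
    <Directory "/srv/www/xine/htdocs">
        Options +MultiViews
    </Directory>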
The whole generation is done with standard XSLT, without implementation-dependent features, which means it will work with libxslt just as well as with Saxon or anything else, although I’m going to use libxslt for now since that’s what I use for my own site as well. By sticking to standard technologies it’s possible to keep reusing them in the future without depending on particular versions of libraries. And thanks to the way XSLT has been designed, it’s very easy to decouple the content from the style, which is exactly what a good site should do to stay maintainable for a long time.
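To give an idea of what that decoupling looks like, here is a minimal sketch; element names, attributes and files are made up for illustration, they are not the actual xine stylesheets. A page is plain XML, say <page title="FAQ"><body>…content…</body></page>, and a single skin stylesheet wraps every page into the common XHTML shell:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- skin.xsl (hypothetical): wraps any <page> document into the shared shell -->
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns="http://www.w3.org/1999/xhtml">

      <xsl:output method="xml" indent="yes"
                  doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
                  doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>

      <xsl:template match="/page">
        <html>
          <head>
            <title><xsl:value-of select="@title"/> - xine</title>
            <link rel="stylesheet" type="text/css" href="/style.css"/>
          </head>
          <body>
            <!-- the shared menu and footer would be generated here, once, at build time -->
            <div id="content">
              <xsl:copy-of select="body/node()"/>
            </div>
          </body>
        </html>
      </xsl:template>

    </xsl:stylesheet>

At build time each page goes through a single xsltproc call (xsltproc skin.xsl faq.xml > faq.html), and the resulting file is served untouched to every visitor.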
Since I dislike custom solutions, I’ve been trying very hard to avoid custom elements and custom templates outside the main skin: the idea is that XHTML mostly works by itself, and a proper CSS takes care of most of the remaining stuff. This isn’t too difficult once you get past the fact that the original design was entirely based on tables rather than proper div elements, and the whole thing has been manageable.
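As a concrete (and, again, made-up) example of what “XHTML plus a proper CSS” buys: the indentation of the sub-page menu entries, which the old code built by emitting a little table in front of each link, becomes a nested list…

    <!-- hypothetical menu markup -->
    <ul class="menu">
      <li><a href="/download">Download</a>
        <ul>
          <li><a href="/download/snapshots">Snapshots</a></li>
        </ul>
      </li>
    </ul>

…plus one rule in the stylesheet:

    /* sub-entries are indented by the stylesheet, no layout tables involved */
    .menu, .menu ul {
      list-style: none;
    }
    .menu ul {
      padding-left: 1.5em;
    }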
Besides, with this method adding a dynamically-generated (but statically-served) sitemap is also quite trivial, since it’s just a different stylesheet applied over the same data that drives the rest of the site.
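As a sketch of that, assuming (purely for illustration) a shared site description file along the lines of <site><page href="/download" title="Download"/>…</site>, the sitemap is just one more small stylesheet over that same file:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- sitemap.xsl (hypothetical): renders the shared page list as an XHTML site map -->
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns="http://www.w3.org/1999/xhtml">

      <xsl:output method="xml" indent="yes"/>

      <xsl:template match="/site">
        <html>
          <head><title>Site map</title></head>
          <body>
            <ul>
              <!-- one entry per page, straight from the same data the menu uses -->
              <xsl:for-each select="page">
                <li><a href="{@href}"><xsl:value-of select="@title"/></a></li>
              </xsl:for-each>
            </ul>
          </body>
        </html>
      </xsl:template>

    </xsl:stylesheet>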
Right now I’m still working on fixing up the security page, but the temporary, not-yet-totally-live site is available for testing, and the repository is also accessible if you want to see how it’s actually implemented. I’ve actually added a few refinements to the xine site that I didn’t use for my own, but those will come with time.
The site does not yet validate properly, but the idea is that it will once it’s up; I “just” need to get rid of the remaining use of tables.