Some of you might have already read about the personal ruleset that I developed to protect my blog from the tons of spam comments it receives daily. It is a set of configuration files for ModSecurity for Apache that denies crawlers, spammers and other malicious clients access to my websites.
I was talking with Jean-Baptiste of VLC fame the past two days about using the same ruleset to protect their Wiki, which has even worse spam problems than my blog. Judging from the logs j-b has shown me, my rules already cover most of the requests he’s seeing (which is a very positive note for my ruleset); on the other hand, configuring their web host to properly make use of them is proving quite tricky.
In Gentoo, when you install ModSecurity you get both the Apache module, with its basic configuration, and a separate package with the Core Rule Set (CRS). The split is an idea of mine to make rule updates easier: the rules are sometimes updated even when the code itself is unchanged, which is the whole point of making them independent of the engine. With the split package layout, the updater script designed to be used together with ModSecurity is not useful on Gentoo, so it is not even installed, even though it is supposedly flexible enough that I could make it work with my ruleset as well.
In Debian, though, the situation is rather more complex. First of all, there is no configuration installed with the libapache-mod-security package, which only installs the module itself and the file to load it. At a minimum, for ModSecurity to work you have to configure the SecDataDir directive, and then give it the set of rules to use. The CRS files, including the basic configuration files, are installed by the Debian packages as part of the documentation, in
I’ve now improved the code to provide an init configuration file that can be used without CRS, but it seriously makes me wonder how Debian admins can deal with ModSecurity at all.
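To give an idea of what that minimum looks like, here is a rough sketch of a bare configuration: a data directory plus a set of rules to load. It is not the init file from my ruleset. It assumes ModSecurity 2.x (whose module identifier is security2_module) and that the data directive in question is SecDataDir. Both paths are placeholders I made up for illustration, so adjust them to wherever your distribution or your own checkout keeps things.

```apache
# Minimal sketch only; both paths below are hypothetical placeholders.
<IfModule security2_module>
    # Enable the rule engine. Use "DetectionOnly" instead of "On"
    # to log matches without blocking anything while testing.
    SecRuleEngine On

    # Directory where ModSecurity keeps persistent data (for example
    # per-IP collections). It must be writable by the Apache user.
    SecDataDir /var/cache/modsecurity

    # Load the ruleset: any directory of .conf files containing rules.
    Include /etc/modsecurity/rules/*.conf
</IfModule>
```

Wrapping everything in IfModule means Apache still starts if the module is not loaded, which is handy when the same configuration is shared between hosts.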
Finally, a consideration: the next version of ModSecurity will have support for looking posted URLs up in the Google Safebrowsing database, which is very good as an antispam measure. I have hopes that either the next release or the one after will also bring Project Honey Pot http:BL support, given that the Apache module for it was totally messed up and unusable. That would make it a sweet tool to block crawlers and spammers!