Following another lscube problem discussion, here comes another post of the series “why do I prefer one solution over another”, that I started thinking about after I ripped off the plugins interface from the current feng master branch (to be developed separately and brought back to the master once the interface is properly defined, and chosen and so on).
While ripping off the plugin I had to deal with the configuration file parser that we have in feng right now, which was heavily borrowed out of lighttpd. Luca’s idea has been that, since the scope of lighttpd and feng are quite similar, we could share the same configuration file syntax, and even code. Unfortunately, this ended up having a couple of problems: the first is that, as usual for Luca’s idea, they look so much forward that we only implement a small percentage of the right now: vhost, conditionals and all that stuff is something we not even barely start scratching them; the other is that the code used for the parser has changed quite a bit in lighttpd in the mean time; I think the original one was hand-tailored, then they moved to lemon-based parsing (the parser used by SQLite3) and now they are toying with Ragel to rewrite the mess that it’s now.
But in particular why is the current system a mess? Well the first problem is not really in the syntax itself, but rather n the current (lemon-based, I think) implementation: it uses a big table, which is relatively fine, but then it also references its entries with their direct array index, which means any change in the list of entries require to change all the following ones; not the nicest piece of code I had to deal with.
The second problem is also intrinsic in the way the code was written, partly by hand: there are some objects implemented for the configuration file parsing, like high-level pointer arrays and strings, which reinvent what glib already provides. These objects increase the size and complexity of our code with the only good reason that lighttpd does not use glib and thus have a need for them. Decoupling the configuration file code from these objects is non-trivial, and will diverge the code to the point we wouldn’t be able to share it with lighttpd any longer.
Now there is a certain complexity tied to the way we want the configuration file to behave: we want to have conditionals and blocks, and includes, similar to lighttpd and Apache configuration files, like most software that deals with servers and locations. The simplest configuration file formats don’t work very well for this task (for instance I don’t like the way Cherokee configuration files are written; they are designed to work with a web administration package, rather than hand-edited by an human). This kind of solution usually gets solved in two main ways: either by inventing your own syntax or using something like XML for the configuration file.
While XML makes it relatively easy to parse a configuration file (relatively because it’s just moving the issue of complexity; parsing XML is not usually faster than other configuration file syntax, it’s actually slower and much more complex; but generic XML parsers exist already that make the parsing easier for the final software), it definitely makes it obnoxious to read and edit by hand, which is why I don’t think XML is a good format for configuration files.
My favourite format for configuration file is the “good old” INI-like file format; I call it INI-like but I admit I’m not sure if Windows’s INI files were the ones using it first; on the other hand, the same format is nowadays used by so much stuff, including freedesktop’s .desktop
files, and there are multiple well-tested parsers for it, including some (very limited, to be honest) support in glib itself.
Now, it’s obviously a no-brainer that the INI format is nowhere near powerful as the language used by Apache configuration files, or by lighttpd, since it bears no conditionals, and only has structure in the sense of sections, variables and values. But it’s exactly in this kind of simplicity that I think it might work well for feng nonetheless, given it works for git as well.
While the lighttpd syntax contains all kind of conditionals that allow setting particular limits based on virtual host, location, and generic headers, it’s quite likely that for feng we can reduce this to a tighter subset of conditions.
What we already have some small predisposition for is per-virtualhost settings, which are handled with an array (right now a single entry) of specific configuration structures, each of which is initialised first with the default value of the configuration and then overridden with the loaded configuration, and then support for some global non-replaceable settings. How does that fare with an INI-like configuration file? Well, to me it’s quite easy to think of it in this terms:
[global]
user = feng
group = feng
logdir = /var/log/feng
errorlog = error.log
sctp = on
listen = 0.0.0.0:554 localhost:8554
[default]
accesslog = access.log
[localhost]
accesslog =
accesslog_syslog = on
[127.0.0.1]
alias = localhost
[my.vhost.tld]
accesslog = my_access.log
[his.vhost.tld]
accesslog = /var/log/feng/another_access.log
This obviously is just a quick draft of how it could be achieved and does not mean that it should be achieved this way at all; but it’s just to say that maybe we should be looking into some alternative, even though keeping syntax-compatibility with lighttpd is a nice feature, it shouldn’t condition the extra complexity we’re currently seeing!