A few days before leaving for my vacation (more on that later), I noticed an identi.ca post from tante about a CERT-FI advisory on vulnerabilities in XML parsers. Since I was about to leave I didn’t want to pick it up myself, but I nudged our security team about it. Unfortunately this was the preamble to a multi-level fuck up.
When I first saw the advisory, it didn’t name expat, not even in passing, but it referred to the Python parser, and I remembered that Python uses an internal copy of expat by default. So I was worried, and the worry seems to have been justified: the bug is in the expat code itself rather than in the glue to Python, so it is present in all software using expat. Robert was able to reproduce the issue with software that only uses libexpat, and not Python. At the time of writing, though, CERT-FI does not list standalone expat among the vulnerable software, just “Python’s libexpat”.
Indeed, the fix is present in the latest revision of expat in Gentoo’s tree, but up to today that fix had been escalated and pushed without going properly through security, which would have meant it wouldn’t have been scheduled for security stable.
But there is an even greater fuck-up in all this, and those who have been following me for some time are probably already expecting it: bundled copies of libexpat! Indeed, it is bundled not only in a bunch of closed-source software but also in a lot of free software packages. The bundled libs bug is a good index for those things.
The problem now is to make sure the list is updated, and hopefully also to make sure that the vulnerable proprietary software gets handled properly. Unfortunately expat is probably the second most commonly bundled library after zlib, and the thought of a vulnerability in it has already made me shiver more than a few times. Well, the time has come.
Now, can anybody really still find the unbundling effort pointless? Seriously?
Coincidentally, I was just discussing library bundling with the laconica developers yesterday. Right now, the laconica distribution has tons of PHP libraries bundled with it, so users can install laconica without installing the packages it depends on (because hosting providers usually don’t let their customers install packages). Bundled packages really suck. I plan on thinking of something ingenious to fix the issue in laconica… *sigh*
The Debian security team keeps an index of embedded code copies BTW: http://svn.debian.org/wsvn/…
@pabs yes, I knew about that list; I guess one problem there is that the list is very Debian-specific and, I’m afraid, not complete even for you. I also know that another security team (not distro-related) was working on indexing the code that gets copied around… I was already thinking some time ago of starting a project to list these things; unfortunately I don’t think free-form text is going to help. We’d need some structure to say “project A embeds project B; forked it or not” and “project C took code from D (here and there)”, with cross-references (so you could look for all the packages that copied from E, or all the ones that F has copied from). Maybe I can come up with some XML format that translates to DocBook and from there to (X)HTML; then we could set up a cross-distribution repository and keep it updated…
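To make the idea of a structured, cross-referenced list a bit more concrete, here is a minimal sketch in Python. It is not the XML format mentioned above; the class names, field names and project names are all made up for illustration, and a real repository would need versions, file locations and much more.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Embedding:
    """One record: `package` carries code taken from `library`."""
    package: str
    library: str
    forked: bool = False   # True if the bundled copy was modified locally
    partial: bool = False  # True if only fragments were copied "here and there"


class EmbedIndex:
    """Hypothetical in-memory index that can be queried in both directions."""

    def __init__(self):
        self._by_library = defaultdict(list)
        self._by_package = defaultdict(list)

    def add(self, record):
        # Store the record under both keys so lookups work either way.
        self._by_library[record.library].append(record)
        self._by_package[record.package].append(record)

    def embedders_of(self, library):
        """All packages that carry a copy of `library`."""
        return sorted(r.package for r in self._by_library.get(library, []))

    def embedded_in(self, package):
        """All libraries that `package` carries a copy of."""
        return sorted(r.library for r in self._by_package.get(package, []))


# Placeholder project names, mirroring the A/B/C/D wording above.
index = EmbedIndex()
index.add(Embedding("project-a", "project-b"))
index.add(Embedding("project-c", "project-d", partial=True))
index.add(Embedding("project-c", "project-b", forked=True))
```

With records like these, index.embedders_of("project-b") answers "who do I need to patch when project-b has a vulnerability", which is exactly the question a free-form list makes hard to answer.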
Disclaimer: I’m not currently a programmer. I used to program in a dead, legacy environment (don’t ask). As I understand your comments, you’re thinking of some kind of tree-structured index of all embedded copies of expat, with versioning. While this is an appropriate ideal and I like it, perhaps a different view might be more immediately useful. From a user’s perspective, it’s probably desirable to be able to identify all packages with some embedded version of expat, in both open source and proprietary packages. The nominal brute-force approach is simply to ‘grep’ all of one’s executables for strings which might indicate the presence of expat. Given compression and compiled-code obfuscation tricks, a real tool would need to be more elegant. I imagine a small set of identifying strings and compiled code fragments, combined with calls to the pertinent decompression routines, would be needed to generate a list of all packages on a user’s system where expat might be found. Such a tool would be a good place to start in order to define what a cross-distribution repository should include. I vividly recall all the packages we (the company I was working at at the time) needed to update when the zlib fracas occurred. I thought then that such a tool would be helpful and I still think so. Even better would be a generalized version where a user-defined set of strings and compiled code fragments could be created, so a user could check for the use of other embedded libraries. We were surprised at some of the places where zlib turned up. Best regards, and thank you for the care you take in your support efforts. Guy
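The brute-force ‘grep’ approach described in this comment can be sketched in a few lines of Python. The signature strings below are only illustrative guesses at what a binary embedding expat might contain, not a vetted list, and the function names are made up for this example. The sketch does not handle compressed or packed executables, which is the hard part the comment points out.

```python
import os

# Illustrative, user-editable signatures (assumptions, not a vetted list).
# Each library name maps to byte strings that hint at its presence.
SIGNATURES = {
    "expat": [b"XML_ParserCreate", b"expat_"],
}


def scan_file(path, signatures):
    """Return the sorted names of libraries whose signatures occur in the file."""
    try:
        with open(path, "rb") as handle:
            data = handle.read()  # whole file in memory: fine for a sketch
    except OSError:
        return []  # unreadable files are silently skipped
    return sorted(
        name
        for name, needles in signatures.items()
        if any(needle in data for needle in needles)
    )


def scan_tree(root, signatures):
    """Walk `root` and map each matching file path to the libraries found."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not os.path.isfile(path):
                continue  # skip broken symlinks, sockets and the like
            found = scan_file(path, signatures)
            if found:
                hits[path] = found
    return hits
```

Calling scan_tree("/usr/bin", SIGNATURES) would give a first list of candidates to check by hand; adding an entry to the dictionary is all it takes to look for another library, which is the generalized version suggested above. Expect false positives, since a plain string match also hits anything that merely links to or mentions the library.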
And in a really nice world, distributions would pool together and pressure upstream projects to remove bundled libs, or at least offer the option to use the system copy. If a distribution like Debian demanded this as a prerequisite for entering stable, there would be a lot of motivation for many projects…