Having to rebuild one’s /usr/lib64 tree helps one learn that there are quite a few duplicated files installed on a system.
The first thing I have to suggest to anybody who happens to have my problem is: make sure you remove the debug info files before starting the procedure. They are big and there are a lot of them, and if you, like me, still have partial directories present, it’s simple to find them and remove them altogether, which will save you a lot of md5sum calls. The script I’m using (actually it’s a one-liner, albeit one with two while statements in it) is still doing a test run with echo rather than the commands themselves, but once I’m sure it works as intended, I’ll post it here, in case someone else might need it.
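As an illustration of the echo-based test run described above, here is a minimal sketch for the debug info cleanup. It is not the one-liner mentioned in the post. The function name list_debug_removals is made up for this example, and it assumes the debug info files can be recognised by a .debug suffix, which may not match every setup.

```shell
# Dry run: print the rm commands instead of executing them.
# Assumption: debug info files end in ".debug"; adjust the pattern if yours differ.
# list_debug_removals is a hypothetical helper name used only for this sketch.
list_debug_removals() {
    # $1 is the directory tree to scan.
    find "$1" -type f -name '*.debug' | while IFS= read -r f; do
        # echo keeps this harmless; drop it only once the output looks right.
        echo rm -- "$f"
    done
}
```

Calling it as list_debug_removals /usr/lib64 prints one rm line per matching file and touches nothing. File names containing newlines are not handled by this simple loop.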
Then there are the tricky parts: the script being what it is, it will create a bit of a stir when a given file is present with the same md5sum in different places. This is easiest to see with empty files or files containing just ‘\n’, which all have the same MD5SUM (.keep files are the most common offenders here). To avoid having to copy those files back all over (especially since the mtime would be changed, and that is bad), I’ve added a simple -size +1 to skip over files of 1 byte or less. Hopefully that should take care of it.
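The size filter and the MD5 comparison can be sketched roughly as follows. This is only an illustration under assumptions, not the script from the post: list_duplicates is a hypothetical name, and it relies on GNU find, md5sum and uniq. One detail worth knowing about GNU find is that -size counts 512-byte blocks when no unit is given, so the sketch uses -size +1c to express "larger than 1 byte" explicitly.

```shell
# Print groups of files under "$1" that share an MD5 sum,
# ignoring files of 1 byte or less (empty files, lone-newline .keep files).
# list_duplicates is a hypothetical helper name used only for this sketch.
list_duplicates() {
    # -size +1c matches files larger than 1 byte. Without the "c" suffix,
    # GNU find counts 512-byte blocks, so a bare -size +1 would also skip
    # every file of 512 bytes or less.
    find "$1" -type f -size +1c -exec md5sum {} + |
        sort |
        # An MD5 sum is 32 hex characters: compare only those, and print
        # every member of each duplicate group, separated by blank lines.
        uniq -w32 --all-repeated=separate
}
```

Running list_duplicates /usr/lib64 lists each set of identical files together, which makes offenders such as bundled copies of the same header easy to spot.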
But of course there are duplicated files. PHP is a major offender here: not only does it install copies of the config.sub files, it also has some duplicated libpcre header files. The absolute winner of the “let’s bloat a system” contest, though, is vmware-server, as it comes with a copy of Perl itself, and some of the files are identical right down to the MD5SUM!
In addition to this, my script showed that there are packages installing stuff in /usr/lib when they shouldn’t. The multilib-strict warning usually helps to find these packages, but in the case of xc, for instance, there are no arch-dependent files, so multilib-strict does not trigger (obviously). It is not really a problem, as arch-independent files are fine in /usr/lib, but as far as I can see, those files should instead go to /usr/share/xc.
* scribbles something on his TODO list about this *