bash scripting tiny details

Although I have been an ebuild developer for almost two years now, and contributed through Bugzilla for at least a year before that, I have never considered myself a bash expert. The features I use are mostly the generic ones, a bit more advanced than newbie usage, as ebuilds often need. So from time to time, when I learn some new trick that others have known for ages, or discuss alternatives with other developers, I usually end up posting here to share it with others who might find it useful.

As autoepatch is being written entirely in bash, I keep running into problems and corner cases that I need to dig into, so I have ended up learning a few more tricks, and thinking about a few things for the ebuilds themselves.

The first thing is about sed portability. We have already made sure that “sed” called in ebuild scope is always GNU sed 4, so that the supported command lines are the same everywhere. There is a portable alternative, though, which also seems to be faster: perl. The command “perl -p -i -e” is a workalike replacement for “sed -i -e” and, as far as I can see, it is also faster than sed. I wonder whether, considering we already have perl in the base system, it would be viable to use it as an alternative to sed throughout the Portage tree.

For find(1) we instead rely on a portable subset of options, so that we don’t ask Gentoo/*BSD users to install GNU findutils (which also often breaks on BSD). One of the most used features of find in ebuilds is -print0, whose output is then piped through xargs to run some process on a list of files. Timothy (drizzt) suggested some time ago to use -exec cmd {} + instead, as that merges the xargs behaviour into find itself, avoiding one process spawn and a pipe. Unfortunately this feature, specified in SUSv3, is present in FreeBSD, DragonFlyBSD and NetBSD, but not in OpenBSD… For autoepatch (where I’m going to use that feature pretty often, as it all comes down to find to, well, find the targets) I decided that the find command used has to support that feature, so to run on OpenBSD it will have to depend on GNU findutils (until they implement it). I wonder if the same could be required for the whole of Portage, so that we could then replace the many xargs calls in ebuilds…
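
As a rough illustration of the difference, the sketch below runs the same command over a handful of files through xargs, through -exec … {} +, and through the classic -exec … {} \; form, and reports how many files each invocation received. The directory layout, the file names and the counting one-liner are all made up for the demo; it assumes a find that supports both -print0 and the + terminator.

```shell
#!/usr/bin/env bash
# Sketch: three ways of running a command on the files find locates.
# Paths and file names are hypothetical, created only for this demo.
set -e

workdir=$(mktemp -d)
mkdir -p "${workdir}/dev-lang/foo"
for n in 1 2 3; do
    echo 'inherit eutils php' > "${workdir}/dev-lang/foo/foo-${n}.ebuild"
done

# Stand-in for the real command: prints one line per invocation,
# stating how many file arguments that invocation received.
counter='echo "called with $# file(s)"'

# 1. find piped into xargs: NUL-separated names, an extra xargs process and a pipe.
via_xargs=$(find "${workdir}" -name '*.ebuild' -print0 | xargs -0 sh -c "${counter}" sh)

# 2. -exec ... {} +: find batches the names itself, no xargs needed.
via_plus=$(find "${workdir}" -name '*.ebuild' -exec sh -c "${counter}" sh {} +)

# 3. -exec ... {} \;: the command is run once for every file found.
via_semicolon=$(find "${workdir}" -name '*.ebuild' -exec sh -c "${counter}" sh {} \;)

echo "xargs:     ${via_xargs}"
echo "-exec +:   ${via_plus}"
echo "-exec \\;:"
echo "${via_semicolon}"

rm -rf "${workdir}"
```

With only three files, the first two forms each run the command once with all three names, while the last one runs it three times with one name each.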

I should ask reb about this latter point, but he has, uh, disappeared :/ It seems Gentoo/OpenBSD is one of those projects where the people involved end up disappearing or screaming like crazy (it kind of reminds me of PAM).

Talking about the MacBookPro: today I prepared an ebuild for mactel-sources in my overlay, which I’m going to commit now. It takes gentoo-sources and applies the mactel patches on top of it, which is easier for me to handle in the long run; this way, the Synaptics driver for the touchpad actually worked fine. Unfortunately, KSynaptics made the touchpad go crazy, so I suggest everybody NOT try it as it is now.

3 thoughts on “bash scripting tiny details”

  1. One reason for doing ‘find <blah> | xargs cmd’ is that it dramatically reduces the number of commands executed. With ‘find <blah> -exec <cmd> {} \;’, <cmd> is executed once for every single result. With the xargs approach, <cmd> is executed with as many results as fit into a command line. Compare and contrast:

    cd /usr/portage
    find . -name '*.ebuild' -exec grep -l 'inherit.* php' {} \;

    1 find, 23817 grep – 23818 processes in total.

    cd /usr/portage
    find . -name '*.ebuild' | xargs grep -l 'inherit.* php'

    1 find, 1 xargs, 9 grep – 11 processes in total.

    So it can save a lot of process creation overhead where there are many results. Whether to use xargs or not depends on the number of results you expect to find (and, of course, on whether the command you want to run accepts more than one file parameter).

  2. That should have read: With ‘find -exec <cmd> {} \;’, <cmd> is executed once for every single result. With the xargs approach, <cmd> is executed with as many results as fit into a command line.

  3. Kevin, I was referring to find -exec <cmd> {} + rather than \;, which has the same behaviour as xargs (unless you’re acting on a very large number of files, because it could go over the maximum number of arguments for the OS).
