This may seem strange, even absurd, but bsdtar on Linux is quite an interesting topic.
As many of you know, I’m one of the Gentoo/FreeBSD developers. One thing we want to keep unchanged in Gentoo/FreeBSD with respect to plain FreeBSD is the userland: not only the libc but also the utilities. That’s why we avoid replacing BSD utilities and BSD libraries with versions not used by default FreeBSD installations.
For this reason, we usually prefix ‘g’ to command names when they collide with BSD’s commands, so we have gsed, gmake, gtar, gm4 and so on.
But another interesting thing to do is to use the BSD userland on Linux, which is almost always possible without problems. The bsdtar package is one of the best examples of that.
bsdtar handles a few file formats that GNU tar doesn’t, such as zip and cpio files, and it handles compressed archives directly, without spawning a bzip2/gzip process and then another tar.
This makes a slight improvement in the time needed to extract things:
flame@enterprise ~/test $ time tar xf /var/portage/distfiles/kdelibs-3.4.1.tar.bz2
real 0m17.072s
user 0m5.763s
sys 0m1.190s
[ in the mean time the directory was removed ]
flame@enterprise ~/test $ time bsdtar xf /var/portage/distfiles/kdelibs-3.4.1.tar.bz2
real 0m14.421s
user 0m5.581s
sys 0m1.288s
and also the size is in favour of bsdtar:
flame@enterprise ~ $ ls -l `which tar` `which bsdtar`
-rwxr-xr-x 1 root root 266568 mar 18 19:34 /bin/tar
-rwxr-xr-x 1 root root 171392 giu 16 00:39 /usr/bin/bsdtar
(note: dates are expressed in Italian even if I have LANG=“en_US”, don’t ask me why).
bsdtar and GNU tar share the same basic syntax, even though they aren’t compatible when it comes to the more exotic extended syntax, which is not POSIX anyway.
If you want to try it, it’s in Portage as app-arch/bsdtar. It doesn’t overwrite your tar, since it’s installed as bsdtar (and only linked to tar if $USERLAND == "BSD"), so it’s safe. I wonder whether there is a way to let users select which tar tool to use, since Portage uses a fairly portable syntax that doesn’t seem to have problems with bsdtar anyway.
I hope someone will find this useful. I, for one, have set alias tar=bsdtar in my bashrc file.
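As a rough idea of what selecting a tar tool could look like in a script, here is a minimal sketch. The helper name pick_tar and the TAR variable are made up for illustration; this is not something Portage provides. It only relies on the basic syntax that both tools share.

```shell
# Hypothetical helper (not part of Portage): print the tar implementation
# to use, preferring bsdtar when it is installed and falling back to tar.
pick_tar() {
    if command -v bsdtar >/dev/null 2>&1; then
        echo bsdtar
    else
        echo tar
    fi
}

# Scripts then call "$TAR" instead of hard-coding a tool name.
TAR=$(pick_tar)
echo "using: $TAR"
```

After this, something like "$TAR" xf some-archive.tar.bz2 works with either tool, as long as only the common syntax is used.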
P.S.: obviously, linking tar to bsdtar isn’t supported, and nobody will ever support you if you do that, so please really avoid it, thanks.
Any LC_* variable that is null or unset falls back to LANG.
LC_ALL overrides them all.
unset LC_ALL and use LC_TIME=POSIX 🙂
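A quick way to try that suggestion from a shell, as a small sketch: the variable month_name is only there for illustration, and LC_TIME is set for single commands rather than exported.

```shell
# Make sure LC_ALL cannot override the per-category setting.
unset LC_ALL

# Set LC_TIME for this one command only; LANG and the other LC_* are untouched.
month_name=$(LC_TIME=POSIX date +%b)
echo "$month_name"

# The same prefix works for a listing like the one in the post.
LC_TIME=POSIX ls -ld /
```

With LC_TIME=POSIX the month abbreviations come out in English (Jan, Feb, ...) whatever LANG says.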
Did you take disk caching by the kernel into account while doing these benchmarks?
I tested this with multiple accesses, and it seems that although bsdtar has a slight advantage, it’s not as good as shown.
The tests were done on an Athlon XP 1800+ with an IDE drive, and these are the results:

I) Preliminary read into the cache:

$ ls -l kdegames-*.tar.bz2 | awk '{print $5, $9};'
9311008 kdegames-3.3.1.tar.bz2
$ time bsdtar tf kdegames-3.3.1.tar.bz2 # bsdtar was tried first, to handicap it.
real 0m14.074s
user 0m5.146s
sys 0m0.124s
$ time tar tf kdegames-3.3.1.tar.bz2
real 0m14.872s
user 0m5.331s
sys 0m0.212s

II) Actual extracting, with the command:

$ time $TAR xf kdegames-3.3.1.tar.bz2

The order was BSD1, GNU1, BSD2, GNU2, BSD3, and the results:

test  real    user   sys
--------------------------
BSD1  12.819  5.123  1.223
GNU1   8.495  5.445  1.622
BSD2   7.288  5.108  0.981
GNU2   8.911  5.472  1.482
BSD3   7.756  5.119  1.022

So, I guess the claim that bsdtar is faster stands, but it also shows the effect of caching and other system-related activities. Note that in the first run, GNU tar shaves more than 4 seconds off the bsdtar time, which is more than the advantage that bsdtar has in your test, on a smaller file (but maybe a slower system).
Also, it’s nice to know that tar doesn’t require the -j flag anymore 🙂
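The procedure above (read the archive once, then alternate the two tools) could be scripted roughly like this. It is only a sketch: the function name bench_tar is invented here, timing uses date +%s (widely available but not strictly POSIX) and so has one-second resolution, and each run extracts into a fresh temporary directory.

```shell
# Hypothetical helper: bench_tar ARCHIVE [RUNS]
# Reads the archive once so every timed run starts with it in the page cache,
# then alternates between the tar implementations found on PATH.
bench_tar() {
    archive=$1
    runs=${2:-3}

    # Preliminary read into the cache, not timed.
    cat "$archive" >/dev/null || return 1

    i=1
    while [ "$i" -le "$runs" ]; do
        for impl in bsdtar tar; do
            # Skip implementations that are not installed.
            command -v "$impl" >/dev/null 2>&1 || continue
            dir=$(mktemp -d) || return 1
            start=$(date +%s)
            "$impl" xf "$archive" -C "$dir"
            end=$(date +%s)
            echo "$impl run $i: $((end - start))s"
            rm -rf "$dir"
        done
        i=$((i + 1))
    done
}
```

It would be called as bench_tar kdegames-3.3.1.tar.bz2 3. For finer resolution, replacing the date calls with the shell’s time keyword around the extraction is the obvious variation.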