Who wants to support largefile?

This post is inspired by a post by Eric Sandeen, whose blog I read last night after discovering we share an interest in making software build in parallel.

A little background for those who don’t know the issue I’m going to talk about. Classically, inode numbers and file offsets were 32-bit values, but as you might guess this no longer holds. Files bigger than 2GB (the highest offset a signed 32-bit value can represent) are quite common: just think of DVD images or, even better, of Blu-ray discs, whose 50GB images are huge. Modern filesystems (as Eric points out: XFS, btrfs and ext4) also have, or might have, 64-bit inode numbers. Since changing the size of the types would have broken ABI compatibility, GNU libc, as well as other libraries, added support for the so-called “largefile” mode. In largefile mode, the standard file operations use 64-bit types. This is implemented by replacing calls like open() or stat() with 64-bit variants, called open64() and stat64(). Other operating systems, like FreeBSD, broke ABI compatibility and only have 64-bit interfaces. On systems that are natively 64-bit, like AMD64, the 64-bit interface is the default, so the separate 64-bit-specific interface is not needed.
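
To make the 2GB limit concrete, here is a minimal shell sketch of the arithmetic: a signed 32-bit offset tops out at 2^31 - 1 bytes. The helper name `fits_in_32bit_offset` is made up for this example and is not part of any tool mentioned in this post.

```bash
# Largest byte offset a signed 32-bit off_t can hold: 2^31 - 1.
MAX_OFFSET_32=2147483647

# Hypothetical helper: succeeds if a file of the given size (in bytes)
# can be fully addressed with a signed 32-bit offset, fails otherwise.
fits_in_32bit_offset() {
    [ "$1" -le "$MAX_OFFSET_32" ]
}
```

Called as `fits_in_32bit_offset 50000000000` (a 50GB image), the helper fails, which is exactly the case the 32-bit interface cannot handle.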

Now, since the two interfaces are, well, different interfaces, the only moment when they can be switched is at build time. You need to pass some compiler defines so that the calls are replaced at build time, and the program thus uses either the old interface or the new largefile one. Most packages you can think of are probably using largefile already: some conditionally, some unconditionally because they need it, and some unconditionally, needed or not, just to be safe. The problem is that not all software can deal with largefile properly as it is.
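
As an illustration of what passing those compiler defines can look like, here is a small shell sketch that adds the glibc largefile feature macros to a flag string. The helper name `with_largefile_flags` is hypothetical; the two defines are the usual glibc macros, and whether a given package actually copes with them is a separate question.

```bash
# Hypothetical helper: print the given flag string with the largefile
# defines appended, unless _FILE_OFFSET_BITS is already set in it.
with_largefile_flags() {
    case " $1 " in
        *" -D_FILE_OFFSET_BITS="*)
            # Already configured by the caller: leave the flags untouched.
            printf '%s\n' "$1"
            ;;
        *)
            # ${1:+$1 } expands to the original flags plus a space,
            # or to nothing when no flags were given.
            printf '%s\n' "${1:+$1 }-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE"
            ;;
    esac
}
```

A build script could then use something like `CFLAGS=$(with_largefile_flags "${CFLAGS}")`. On glibc systems, `getconf LFS_CFLAGS` reports the flags the system itself suggests, which may be a better source where it is available.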

The usual way to discover that a package does not support largefile is watching it fail on a >2GB file. That is not so nice, since it means you fix the problem only once it has become a problem; it would be much better to identify it earlier, so that it can be solved before it bites anyone. But Eric’s post gave me an idea: I asked him for the script (which you can find attached to this post; update: I finally was able to make lighttpd serve the script, and for once Typo was innocent), and I used the same logic to identify packages using the 32-bit interfaces, by running scanelf after Portage installs them.

This is not yet a complete test, since I’m forcing it to work only on x86 systems (I wanted to exclude AMD64) and it only checks the stat symbols; it should check open, read, write and all the other symbols too. More importantly, this is not going to work with the scanelf that Portage installs right now (0.1.18), since I had to fix it a bit to properly handle regexp matching and matching of multiple symbols. So if you want to try this you’ll probably have to wait until I release version 0.1.19. At any rate, the code in the bashrc file is just the following, for now:

post_src_install() {
    # List installed ELF files that reference the 32-bit stat symbols.
    scanelf -q -F "#s%F" -R -s '-__xstat,-__lxstat,-__fxstat' "${D}" > "${T}"/flameeyes-scanelf-stat64.log
    # A non-empty log means at least one file uses the old interface.
    if [[ -s "${T}"/flameeyes-scanelf-stat64.log ]]; then
        ewarn "Flameeyes QA Warning! Missing largefile support"
        cat "${T}"/flameeyes-scanelf-stat64.log >&2
    fi
}
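
The hook above only looks at the stat family. As a rough sketch of how the symbol list could be widened, independently of scanelf, the function below reads undefined symbol names, one per line, and prints those that belong to the 32-bit interface. The function name and the list of symbols are assumptions made for this example; the list is illustrative rather than complete, and the check only makes sense on 32-bit targets, where the plain names are the 32-bit variants.

```bash
# Hypothetical filter: read symbol names on stdin (one per line) and print
# the ones that have a 64-bit counterpart (open64, lseek64, ...) in glibc.
find_legacy_symbols() {
    while IFS= read -r sym; do
        # Drop a symbol-version suffix such as "@GLIBC_2.0", if present.
        name=${sym%%@*}
        case $name in
            open|creat|lseek|fopen|freopen|fseeko|ftello|\
            truncate|ftruncate|readdir|pread|pwrite|mmap|\
            __xstat|__lxstat|__fxstat)
                printf '%s\n' "$name"
                ;;
        esac
    done
}
```

For example, `printf '%s\n' open64 lseek | find_legacy_symbols` prints only `lseek`. The input could come from something like `nm -D --undefined-only` on an installed binary, with the last column extracted, though that pairing is untested here.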

Please don’t rush submitting bugs for these things though; these are useful to know and they should probably be fixed, but please send the patches upstream rather than directly to Gentoo, for now.

8 thoughts on “Who wants to support largefile?”

  1. -Sorry I meant inode number, rather than inode there, and- *Actually, I did write inode number, as I was thinking.* Of course it has nothing to do with file sizes, but it’s much easier to illustrate the problem with file size-related examples than with inode numbers. The point is that both features are enabled in the same way. *And just to be clear, the inode number is a unique identifier of a given file’s metadata on the filesystem.*


  2. Mm, I think you’re still confused. Inodes don’t have anything to do with it. There’re some good clear diagrams of the kernel data structures in “Advanced Programming in the Unix Environment” — probably worth checking it out.


  3. While I’d love to read that book (hey, you’re free to give me a copy if you wish ;)), I think _you_ are the confused one. When I talk about “inode number” I refer to struct stat’s st_ino member, which the stat(2) man page itself calls “inode number”. On Linux, on 32-bit systems without largefile support enabled, st_ino is 32-bit (sizeof reports 4); on 32-bit systems with largefile and on 64-bit systems, st_ino is 64-bit (sizeof reports 8). XFS is an example of a filesystem that _can_ use 64-bit inode numbers, for instance (and here you can see why it has so many more inodes per filesystem than ext3 allows).

        % cat st_ino_test.c
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <stdio.h>
        int main() {
            struct stat stat;
            printf("%zu\n", sizeof(stat.st_ino));
        }
        % gcc -m64 st_ino_test.c -o st_ino_test1
        % gcc -m32 st_ino_test.c -o st_ino_test2
        % gcc -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -m32 st_ino_test.c -o st_ino_test3
        % ./st_ino_test1
        8
        % ./st_ino_test2
        4
        % ./st_ino_test3
        8

     Savvy?


  4. st_ino has nothing to do with the file size. Nor does its size have much bearing on the commonly encountered ext3 inode limits, which are to do with how ext3 allocates inodes at mkfs time; by default they are much lower than even a 32-bit field would allow, because on ext3 lots of inodes means lots of overhead. You’re being confused by a couple of coincidences in the sizes of fields chosen by Linux. The field you should be looking at is st_size, along with the off_t parameters of calls like lseek (you are remembering to check lseek calls along with stat, right?). The increase in size in st_ino is merely a clever way of avoiding a fourth round of breakage when people start needing more inodes.


  5. I did tell you in the first comment that inode numbers have nothing to do with size; please go back and read that. And I repeat: of course inode number size has nothing to do with file size, they just both *happen* to be extended with largefile support. As for checking lseek, no, I’m not, and if you read the blog post instead of nitpicking on things you can’t even read straight, you’d see I did say that this is not ready for primetime yet. When it is (together with a new enough pax-utils) I’ll ask Zac to add it to Portage’s QA warnings, since we expect software to be built with largefile support for the most part. Are you just jealous you didn’t have the idea of checking for this first?


  6. I am merely curious as to why your original post contained so many mentions of ‘inode’, and why you felt the need to discuss st_ino in any way, since none of it has any bearing upon large file support. Still, I’m glad to see I’ve managed to help you learn a little about Unix basics.


  7. You didn’t really have to help me with my Unix basics, thank you. It would be like helping Patricia Cornwell to write a detective story. Also, I’m afraid I’m still unable to help you learn to read. If you were to read the blog from its incipit you’d know *why* the original code refers to st_ino. But I guess that can’t be helped, can it?

