12 Nov 2012 redi   » (Master)

Gold linker + CentOS5 NFS client + Solaris 10 NFSd = ballache

I've spent a day and a half being completely bewildered by a weird NFS bug: ELF binaries (but not other files) written to an NFS mount show up on remote hosts with the correct file size but consisting entirely of zero bytes. It only happens when the files are written from CentOS5 hosts, not from Solaris, Fedora or RHEL6 hosts.
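For anyone wanting to check whether they're seeing the same symptom, here is a minimal sketch of a test you could run on the remote host. The function names `is_all_zero` and `has_elf_magic` are made up for illustration; the only fact relied on is that a healthy ELF file starts with the four bytes `\x7fELF`.

```python
def is_all_zero(path, chunk_size=1 << 20):
    """Return True if the file is non-empty and every byte in it is NUL."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            seen_data = True
            # A chunk is all zeros when the NUL count equals its length.
            if chunk.count(b"\0") != len(chunk):
                return False
    return seen_data


def has_elf_magic(path):
    """Return True if the file starts with the ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == b"\x7fELF"
```

A binary that reports the right size in `ls -l` but for which `is_all_zero` returns True and `has_elf_magic` returns False matches the corruption described above.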

I eventually narrowed it down to the Gold linker, which writes its output files using mmap. The CentOS5 2.6.18 kernel has a bug when writing files with mmap to NFS mounts.
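To illustrate what "writing a file using mmap" means, here is a rough sketch of that write pattern: size the file first, map it shared and writable, then copy the data into the mapping rather than calling write(). This is not Gold's actual code, and I make no claim that it performs the same sequence of system calls. The `write_via_mmap` name is hypothetical, and the explicit `flush()` (an msync) is an assumption that a real linker may not make.

```python
import mmap
import os


def write_via_mmap(path, data):
    """Write data to path through a shared, writable memory mapping."""
    if not data:
        raise ValueError("cannot map a zero-length file")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # The file must already have its final size before it is mapped.
        os.ftruncate(fd, len(data))
        with mmap.mmap(fd, len(data), access=mmap.ACCESS_WRITE) as m:
            # Stores into the mapping dirty the pages; no write() is issued.
            m[:] = data
            # Assumption: ask the kernel to write the dirty pages back now.
            m.flush()
    finally:
        os.close(fd)
```

Pointing something like this at a path on the affected NFS mount and then reading the file from another host would be one way to test whether mmap writes are the trigger without involving the linker at all.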

There was a very similar RHEL4 bug that should be fixed in my kernel, but for some reason the kernel-2.6.18-redhat.patch file in the SRPM comments out the fix. I don't know why.

Maybe this post will show up for anyone else searching for the symptoms, because I didn't have much luck searching the web for it.

My solution is to avoid Gold on CentOS5 (since we can't easily stop using NFS, unfortunately) but I wish I could get that day of my life back.

Update: The distcc FAQ (search for "Files written to NFS filesystems are corrupt") mentions this problem and refers to a post to the distcc list and a post to the linux-nfs list where a workaround using the no_subtree_check option for nfsd is given, but that assumes the NFS server is Linux, and mine isn't.
