Older blog entries for berend (starting at number 397)

28 Mar 2014 (updated 3 Apr 2014 at 03:28 UTC) »

Really weird issue yesterday trying to move a customer to AWS. Testing was all fine, but when we switched the IP address, the system ground to a halt.

The cause was the NFS server, which became unresponsive, so the web and php5 farms stopped. Using iotop I found out that this was caused by the jbd2 process, jbd2/xvda1-8 in my case. jbd2 was basically at 100% I/O. Initially I thought perhaps the instance was faulty, so I built a new NFS server (simply replaying my ansible script). Got exactly the same behaviour on the new server: all I/O ground to a halt as jbd2 took over as soon as I did even simple things like checking out a Drupal repository (so a single client, doing an svn co).

But why would jbd2, the ext4 journalling thread, kick in? With iostat -x 1 I determined that we were writing 5 MB/s to the root file system. That made no sense. There is nothing on this NFS server that would do that. All data is on separately mounted EBS disks. The root file system is ext4, but all the other file systems were xfs! And the clients only mount the xfs file systems.
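For anyone without iostat at hand, the per-device write rate can be approximated from two snapshots of /proc/diskstats. This is only a sketch under assumptions: the function names are made up for illustration, and it relies on the usual /proc/diskstats layout where the tenth column is the number of sectors written, counted in 512-byte units.

```python
def sectors_written(diskstats_text):
    """Map device name -> sectors written, given the text of /proc/diskstats."""
    result = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        # fields: major, minor, name, then the I/O counters;
        # index 9 is "sectors written"
        result[fields[2]] = int(fields[9])
    return result


def write_rate_mb_per_s(before, after, seconds):
    """MB/s written per device between two snapshots taken `seconds` apart."""
    first = sectors_written(before)
    second = sectors_written(after)
    return {
        dev: (second[dev] - first[dev]) * 512 / seconds / 1e6
        for dev in second
        if dev in first
    }
```

To use it, read /proc/diskstats, sleep for a second, read it again and pass both texts plus the interval to write_rate_mb_per_s(). A device such as xvda1 showing several MB/s while nothing should be writing to it is the kind of oddity described above.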

Using a suggestion to debug what's going on, I tried:

echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable

Waited for a minute and then did:

cat /sys/kernel/debug/tracing/trace

Got a lot of lines like:

nfsd-943   [000] 8559086.521147: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0
nfsd-942 [000] 8559086.527871: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0
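With a minute's worth of trace output it helps to count the events rather than read them. The following is a minimal sketch that tallies ext4_sync_file_enter lines per task, device and inode. It assumes the line format shown above (newer kernels add extra columns that this pattern does not handle), and the names TRACE_LINE and tally_sync_events are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# nfsd-943   [000] 8559086.521147: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0
TRACE_LINE = re.compile(
    r"^\s*(?P<task>\S+)-(?P<pid>\d+)\s+\[\d+\]\s+[\d.]+:\s+"
    r"ext4_sync_file_enter:\s+dev (?P<dev>\d+,\d+) ino (?P<ino>\d+) "
    r"parent (?P<parent>\d+) datasync (?P<datasync>\d+)"
)


def tally_sync_events(trace_text):
    """Count sync events per (task name, device, inode); other lines are ignored."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = TRACE_LINE.match(line)
        if match:
            key = (match["task"], match["dev"], int(match["ino"]))
            counts[key] += 1
    return counts
```

Feeding it the contents of /sys/kernel/debug/tracing/trace and printing counts.most_common(10) shows which task keeps syncing which inode on which device.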

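OK, clearly the NFS daemon causes a lot of sync calls (datasync 0 in the trace means a full fsync() rather than fdatasync()), and they hit dev 202,1, which is xvda1, the root file system. But why would nfsd be syncing files on the root file system, when the clients only mount the xfs file systems?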
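A possible next step here would be to look up which file the reported inode (ino 30703 in the trace) belongs to. The sketch below does roughly what find / -xdev -inum 30703 does: it walks one file system and reports every path with that inode number. The function names are made up for illustration, and this is not something the original debugging session included.

```python
import os


def _lstat_or_none(path):
    """lstat that returns None instead of raising for vanished or unreadable paths."""
    try:
        return os.lstat(path)
    except OSError:
        return None


def find_paths_by_inode(root, inode):
    """Return paths below `root` with inode number `inode`, staying on root's file system."""
    root_dev = os.lstat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        kept_dirs = []
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = _lstat_or_none(path)
            if st is None or st.st_dev != root_dev:
                continue  # other file system (e.g. a mount point) or gone
            if st.st_ino == inode:
                matches.append(path)
            if name in dirnames:
                kept_dirs.append(name)
        # Prune in place so os.walk does not descend into other file systems.
        dirnames[:] = kept_dirs
    return matches
```

Running find_paths_by_inode("/", 30703) as root on the NFS server would name the file nfsd keeps syncing. Inode numbers are only unique per file system, which is why the walk does not cross mount points.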

After more googling I found this comment:
Problem vanished after fsck'ing my ext4 partitions

Huh? Worth a try. Stopped NFS server, mounted root disk on another server, ran fsck:

# fsck /dev/xvdf
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
cloudimg-rootfs: clean, 62399/524288 files, 378952/2097152 blocks

and reattached. Problem solved!!

I have no explanation for this behaviour, except that maybe the latest Ubuntu 12.04 LTS AMI has a bad disk.

PS: I now believe I was wrong, see this update.

The C programming language has been a plague of security problems for decades. When oh when will programmers abandon this language? It's unsafe under any conditions. Who needs type checking? Gotos are not harmful. It's depressing. We have known this stuff since the 60s.

Still using Ed L. Cashin's 2001 backup scripts on my network, a bit sad perhaps. But haven't found a more flexible environment.

I moved it along to support xfs, and now zfs. My backups are written to disk first, then concurrently to tape while the next backup starts. All without too much work.

All other backup tools look like a bit too much work, and you might get stuck when technology moves on.

Kevin Drum on healthcare insurance, so true:

Let’s be honest. What we all want is unlimited access to medical care; unlimited access to any procedure we want no matter how pricey; unlimited choice of physicians; instant availability of doctors every time we get an ear ache; and we’d like all this for free. That’s what we want. And we’re annoyed when we don’t get it.

At Xplain Hosting we have been hosting Drupal since 2008. First on Ubuntu 8.04, systems built from scratch. To upgrade to another version, you build another system from scratch.

Have now switched to Ansible, so hopefully we never have to do that again. And we now have good documentation of what a working system looks like.

Upgraded from FreeBSD 8.x-STABLE to FreeBSD 9.2-PRERELEASE. Disaster. Went back to FreeBSD 9.1. Still a disaster. Every few hours the kernel panics and the server reboots. Even made a major change and went from i386 to amd64. Same bug. Somewhere along the way PF has become unreliable in FreeBSD 9.

Not good. Have used FreeBSD since version, never had these problems.

Not sure what to do now. Probably simplify my pf rules, which won't be easy, to see if the problem disappears that way.

Have gone to the 20th anniversary of the Manukau Symphony Orchestra (MSO). We've heard them since they played in the Papatoetoe Town Hall.

They had commissioned a piece from Gareth Farr: interesting, but it won't work well via CD/Spotify I'm afraid. Usually a problem with modern music. Next piece was Mozart's Violin Concerto No. 3 in G, K. 216. I'm not usually a Mozart fan, unless played on period instruments, but Loata Mahe gave a great solo, a really individual performance.

The big piece was Mahler. The first Mahler for the orchestra, and a really great performance from Uwe Grodd and the orchestra. Uwe was able to draw the music out of the orchestra; the beginning of the first movement especially was very well done. I liked the third movement, and the first half of the fourth movement also went exceptionally well. Great night.

I'm happy to see that the website for Te Paipera Tapu, the Bible in Maori from 1868, has now launched. Still a bit of work to do, but a web presence at last!

14 May 2013 (updated 15 May 2013 at 01:11 UTC) »

Typing gem on Windows:

PS D:\Users\berend.deboer> gem
The term 'gem' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:4
+ gem
+ CategoryInfo : ObjectNotFound: (gem:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException

Same on Ubuntu:

# gem
The program 'gem' can be found in the following packages:
* rubygems1.8
* rubygems1.9.1
Try: apt-get install <selected package>

Really, Windows just feels like an abandoned house. Does anyone do any kind of development on this OS? Maybe they are all using Visual Studio, never leaving it?

15 Jan 2013 (updated 15 Jan 2013 at 02:37 UTC) »

Lot of pain to get Skype working properly after I switched from an increasingly complex ipfw firewall to a pf-based one. Skype consistently gave me "UDP status: local: BAD". I believe that happens when Skype tries to use peers for UDP traffic, instead of talking to the caller directly.

The solution was to use static ports for NAT: with static-port, pf leaves the source port of matching packets unchanged instead of rewriting it. My default rule in pf.conf was this:

nat on egress from any to any -> (egress) round-robin sticky-address

It is now:

nat on egress from port 12345 to any -> (egress) static-port
nat on egress from any to any -> (egress) round-robin sticky-address

Skype now has UDP status local good. Finally, sigh.
