Why We Should All Test the New Linux Kernel

Posted 4 Jan 2001 at 03:37 UTC by goingware

Ladies and Gentlemen, we are approaching an important point in the history of Free Software - the imminent release of a major revision to the Linux kernel. Because the kernel is the foundation of the systems the vast majority of us depend on for our own work, its correctness is vital to the proper functioning of the programs those of us on Advogato develop. Please grab the latest 2.4.0-prerelease sources from the Kernel.org mirror site nearest you and give them a thorough test on your equipment, and with your programs.

The current production kernel is 2.2.18, and for a long time the development kernels were the 2.3 series. The 2.4.0-test1 kernel came out earlier in the year; it not only provides many new features to the Linux system but is also a major rearchitecting of the kernel.

I've been working with the 2.4.0-test and 2.4.0-prerelease kernels for most of this year and found that in general they work pretty well. I think I can safely say that they will be fine when used by someone who is a programmer or competent to administer their own Linux system. This is not to say that the kernel is yet trouble-free, but it is good enough to be worth using by anyone likely to be reading Advogato.

The problem (and the reason that I post this) is that once 2.4.0 is released, it is likely to be rushed into production use on a lot of end-user systems, many with configurations that have not been adequately tested. I'm hoping that more widespread testing will head off such problems.

This will happen in part because users will download the sources and build the kernel themselves to get features or fixes they need, and in part because many of the distributions will rush to include it so they can be perceived as "competitive", either with each other or with non-free operating systems.

But the people doing the most active work with the 2.4.0 kernel are the kernel developers themselves, or those few like me who are just working to test it. I don't think there's a tremendous number of people taking the trouble to test it, and even those who spend the most time at it (the kernel developers) often have limited resources for trying different configurations. (I have heard of distributions that prematurely shipped systems with prerelease kernels - something I consider irresponsible.)

A lot of people will have their very first experience with Linux by purchasing a $29 CD distribution "just to check it out". For many of them, the brand-new 2.4.0 kernel will be what they get, and it's very important that they have a positive experience with it. Every bug found by an Advogato reader is a bug that's not found by a couple of thousand novice Linux users who might not come back for more.

It's very important to have the kernel tested on a wide variety of configurations and under the load of a lot of different applications.

For comparison of what's done in the commercial world, I used to work at Apple, at one time as a QA engineer and at another time as an OS engineer doing system debugging. I tested MacTCP, Apple's older TCP/IP stack, and for that I had about three dozen machines in a lab and worked full-time there about a year doing nothing but testing, writing test plans and writing test tools - all that to QA what was then (1990) considered an unimportant component of the system by most of the company.

At the time I was an OS engineer in the mid-90's, I don't know how many QA staff Apple had, but I would guess it numbered 500 or greater, all working full-time, year round to test a system that had far fewer hardware configurations in question than the Linux kernel is expected to support - and note that Apple maintains extremely tight control over the hardware, where Linux is expected to run on just about anything from Internet Appliances to ancient 386 boxes to mainframes.

The kernel has special needs for testing that require it to be done by a wide variety of people for several reasons:

  • It is distributed as highly configurable source code, so it needs to be tried out with lots of different options to try to find combinations that stimulate bugs
  • It supports a number of different instruction set architectures, and different CPU grades for a given architecture, and even mainframe processors (the S/390) - no one owns all those different machines
  • It supports a very large number of hardware devices, which need to be actually installed to do anything interesting. There may be conflicts between different devices that can only be found out by widespread testing
  • Being an interface between user applications and the hardware, the kernel needs to be tested by running lots of different user-mode programs on it, so that a lot of combinations of system calls and other loads on the system get tested - that's why you should test your application on the new kernel
  • Failure of the kernel in a production system usually has a worse impact than failure of a user mode program

If the kernel is flaky, the obvious risks are that your machine can crash and the filesystem can get corrupted. Users lose data, and they lose the use of their machine until they reboot or even until the problem is resolved. What could be worse is if a buggy kernel doesn't crash but causes incorrect functioning of an otherwise reliable program - this kind of bug is insidious and can be maddening to track down.

There are a few things you'll need to know to get working with your new kernel.

Usually you want to report bugs to the linux-kernel mailing list at linux-kernel@vger.kernel.org. Note the new mailserver - vger.rutgers.edu apparently had a meltdown.

You probably don't want to actually subscribe to the linux-kernel list because of the volume of mail. I suggest reading the list off of an archive, of which there are many. I like this archive. You can find other archives at Google.

It is of course good form to read the linux-kernel mailing list FAQ.

Once you're connected to a Kernel.org mirror server, the files to look for will be in pub/linux/kernel/v2.4.

You'll only need to download the whole kernel source once; after that you can download and apply the much smaller patches when they come out (you don't have to try to keep up with all the patches; contribute at a pace that's appropriate for you).

If you download the .bz2 files (which are smaller), use bunzip2 to unpack them, then tar -xvpf to extract them. If you download the .tar.gz files, use tar -xzvpf to extract and uncompress them at the same time. Note that f must come last in the option cluster, because tar takes the archive name from whatever follows it. (I download the bz2 files to save time, then recompress them with gzip to save space on my machine and use -xzvpf whenever I need to extract them.)
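As a minimal sketch of the unpacking step, the shell function below chooses the commands from the file extension. The name unpack_kernel is a hypothetical helper invented for illustration, not something shipped with the kernel; add v to the tar options if you want the file list printed as it extracts.

```shell
#!/bin/sh
# unpack_kernel ARCHIVE: extract a kernel tarball into the current directory.
# (Hypothetical helper name, for illustration only.)
unpack_kernel() {
    case "$1" in
        *.tar.bz2)
            # Decompress to standard output and let tar read the stream
            # from standard input ("-f -").
            bunzip2 -c "$1" | tar -xpf -
            ;;
        *.tar.gz)
            # z tells tar to run the archive through gzip itself.
            tar -xzpf "$1"
            ;;
        *)
            echo "unpack_kernel: unrecognized archive type: $1" >&2
            return 1
            ;;
    esac
}
```

For example, "unpack_kernel linux-2.4.0-prerelease.tar.bz2" (the file name here is only a guess at what your mirror carries) would leave the sources in a directory called linux.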

When you untar the sources, a directory called "linux" will be created. I won't go into how to configure and build the kernel; for that, the Kernel Newbies website has the best information.

The one big gotcha I found was that to change from running a 2.2 kernel to a 2.4 kernel I needed a new set of modutils, the programs that manage the kernel modules. Without them you'll get a lot of undefined symbols in your modules and your modules won't load right (the new modutils seem to work OK with old kernels). You'll find the new modutils on your local mirror server in pub/linux/utils/kernel/modutils/v2.4.

If you want to contribute the most, try to download and apply the patches that come out. If you have a specific problem, and someone posts a patch for it on the mailing list, you can grab the patch out of the email and apply it. Alternatively, you can get the compilations that are distributed by Linus or Alan Cox, which contain all of the submitted patches that they've approved of. Note that a submitted patch that fixes something doesn't always get included in the new compilations; when that happens you need to politely remind the kernel developers that the problem remains and maybe resubmit the patch.

If you've got a patch named patch-2.4-prerelease-ac5 then you apply it to the 2.4.0-prerelease kernel sources by cd'ing into the linux directory (kernel source top level) and executing:

patch -p1 < patch-2.4-prerelease-ac5

Note that patch takes its input from standard input rather than as a command line parameter - don't forget to redirect with <.
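One cautious refinement, offered as a sketch rather than required practice: GNU patch has a --dry-run option that reports whether a patch would apply without changing any files, and -f stops it from asking questions. The function name apply_kernel_patch is hypothetical, and the sketch assumes GNU patch is the patch on your system.

```shell
#!/bin/sh
# apply_kernel_patch PATCHFILE: run from the top-level linux directory.
# (Hypothetical helper; assumes GNU patch, which provides --dry-run and -f.)
apply_kernel_patch() {
    # First pass: check that every hunk applies, changing nothing on disk.
    # -f keeps patch from prompting, so a bad patch just fails.
    if ! patch -p1 -f --dry-run < "$1" > /dev/null; then
        echo "apply_kernel_patch: $1 does not apply cleanly; tree left untouched" >&2
        return 1
    fi
    # Second pass: apply for real. -p1 strips the leading directory
    # component from the file names recorded inside the patch.
    patch -p1 < "$1"
}
```

Checking first means a patch made against the wrong kernel version is refused outright instead of leaving the tree half patched with .rej files scattered through it.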

Linus' patch compilations will be in pub/linux/kernel/testing. Alan Cox's will be in pub/linux/kernel/people/alan/2.4.0test/. Generally Linus' stuff is more official and stable, while Alan's patches are often the first try at something or an experiment with a fix.

A few more helpful tidbits:

After you configure your kernel a file named .config will be created in the linux directory. This holds all the configuration options you just selected. It is helpful to make a directory somewhere and save copies of your .configs with names that reflect the kernel version and the most significant options you've set in that build. You can then use the saved files to recover earlier kernels for testing at a later date, and if the kernel developers need it you can send the .config file for a kernel that had some bug you're reporting.

If you patch your kernel sources, the fastest way to reconfigure them is to copy an old .config file into the linux directory and give the command "make oldconfig". You'll be prompted for new items that weren't mentioned in the old file. It's probably best to run through the whole configuration manually when you first create a 2.4 kernel though, as there is a lot of new stuff.
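The bookkeeping described above can be sketched as a pair of small shell functions. The names save_config and restore_config, the CONFIG_STASH variable and its default location are all assumptions made up for this example; use whatever layout suits you.

```shell
#!/bin/sh
# Directory for saved configurations; the default location is an assumption.
CONFIG_STASH=${CONFIG_STASH:-$HOME/kernel-configs}

# save_config TREE LABEL: copy TREE/.config into the stash under a
# descriptive name, e.g. LABEL "2.4.0-prerelease-devfs".
save_config() {
    mkdir -p "$CONFIG_STASH" &&
    cp "$1/.config" "$CONFIG_STASH/config-$2"
}

# restore_config TREE LABEL: put a saved configuration back in place.
# Afterwards, run "make oldconfig" in TREE to be asked about new options.
restore_config() {
    cp "$CONFIG_STASH/config-$2" "$1/.config"
}
```

A typical round trip would be "save_config linux 2.4.0-prerelease-devfs" after a build, and later "restore_config linux 2.4.0-prerelease-devfs" followed by "make oldconfig" inside the linux directory. The saved file is also the one to attach when a kernel developer asks for your configuration.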

If you have XWindows working on your machine, the most pleasant way to configure your kernel is "make xconfig" (saying this probably marks me as not being a true hacker...). This is also the quickest if you want to just change a few options here and there (it's a GUI configuration tool) or for browsing the config options help. Other possibilities are "make config", "make menuconfig" (for curses-based editing in a terminal), or just manually editing the .config file (not generally recommended because of dependencies between the options).

Finally, if you're working on an Intel-architecture machine, and are trying out frequent new kernels, it is very convenient to install GNU Grub. It is a much more full-featured bootloader than LILO. Chief among its advantages is that it understands various filesystem formats natively. Unlike LILO, which needs to be reinstalled every time a new kernel binary is put in place, once Grub is installed you only need to edit its menu.lst file if you want to add a totally new kernel name to the boot menu - and if you forget, you can boot the kernel by name from the grub command line.

Because Grub boots the kernel by name rather than by physical disk sector, replacing an old kernel with a new one at the same pathname doesn't require you to do anything at all to Grub. LILO, by contrast, uses a sector list; a new kernel with the same path may be in a different physical location on the disk, so you have to reinstall LILO whenever you put a new kernel in place.

Note that Grub is not yet at 1.0; it works great for me but I suggest starting by making a grub floppy for testing before you install it in your boot sector.
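For orientation, a menu.lst entry for a test kernel might look like the fragment below. This is only a sketch: the title, the disk and partition in (hd0,0), the kernel path and the root= device are all assumptions that must be changed to match your own system.

```
# Hypothetical menu.lst entry; disk, partition and kernel path
# are assumptions and must match your own machine.
title  Linux 2.4.0-prerelease (test)
root   (hd0,0)
kernel /boot/vmlinuz-2.4.0-prerelease root=/dev/hda1 ro
```

Here (hd0,0) is Grub's name for the first partition of the first disk. If you forget to add an entry, typing the same root and kernel lines at the grub command line, followed by boot, starts the kernel by name.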

User Mode Linux - run the kernel as a process, posted 4 Jan 2001 at 05:52 UTC by goingware » (Master)

After a reminder in my email from author Jeff Dike I hasten to add a mention of something I meant to include in the original article - you can run the Linux Kernel as a user process under some other native Linux system using User Mode Linux.

This was featured on Slashdot a while back.

I haven't tried it yet but this is a great thing. Besides safe testing of new kernels (you install your distribution in a filesystem built out of a single regular file, with controlled access to hardware), you can also use it to test potentially dangerous new software (you don't ever run software you've got off the net as root do you ;-/ ), and it potentially allows one to instrument the kernel in all kinds of ways that would be difficult running off of real hardware. Also you can install new kernels and test them without rebooting your machine, and there is a script that automatically starts up a system, runs a command, and reboots it.

The other thing I hasten to add, especially if you're going to be testing on a real machine that contains important data, is do a backup first, save your work frequently, and make more backups regularly - but if you use user-mode Linux, you don't have to worry.

Some Further Thoughts, posted 5 Jan 2001 at 05:00 UTC by goingware » (Master)

I've got a few things to add, mostly in response to email that has been sent to me by folks who've read the article above. Eventually I'll rewrite and organize everything a little better and post it in the articles section (nothing there so far) of the Linux Quality Database (so far just a proposal - wanna help?).

The feedback indicates this is getting read by a lot of people who aren't programmers but do want to help test the kernel, so in some of what follows I give background on some things that should be pretty familiar to most Advogato members.

How Good is the 2.4.0 Kernel Right Now? Should I Feel Safe to Test It?

One fellow wrote in to recommend that I should say that the new kernel works "very well" or at least "well". He felt that my statement that it worked "pretty well" would discourage a lot of people who might otherwise usefully test it.

It is my own experience that I have very little problem with the new kernel, and very likely you won't either. But I hesitate to say anything of substance about how well it's working - if it works at all for you, very likely it will work flawlessly and you'll have the added benefit of whizzy new features and performance enhancements.

But observing the traffic on the linux-kernel mailing list, some people have significant trouble. I feel that if you test it, the likely benefit is that you'll have a nice new toy to play with, but you must accept some risk. At the very least that risk is that your machine won't boot; at worst it is that the kernel will scrag your filesystem or lose data you've created in a program. So it really should only be tested by people that are prepared to accept the possibility of having to fix their machine or recover their data.

Let me contrast this, however, with the condition of Windows 2000 when it was beta tested. I needed to write some Java meant to run on NT for a consulting job and my client thought it would be fun if I used the Windows 2000 Beta. I would suggest "living hell" is a better way to characterize my experience. I had no end of trouble, and it wreaked lots of havoc with my work - for example, I could not use ethernet and DNS via PPP at the same time (even though I ran Windows 2000 server) and had to disable ethernet and reboot before checking my email.

I understand that The Win2K Problem shipped with 64,000 documented bugs of which 25,000 were considered "serious" by Microsoft, and the opinion was widely held among the industry press and IT managers that one should not install it until a few service packs had been released - but Microsoft shipped it anyway. (To be fair, all those bugs were counted among the entire system and not just the kernel.)

I've been running the 2.4.0-test kernels on the machines I use for my daily work since test1 was released. I've had no problems that prevented me from doing useful work. The one serious bug I found was that my Adaptec APA1480 Cardbus SCSI host bus adapter wouldn't function, and that was resolved very early on by working with the mailing list - so now I can burn CD's with a SCSI CD burner off my laptop. The only problem I've got now is that my machine doesn't power itself off when I shut down.

So you be the judge.

Besides Building the Kernel, What Steps Do the Users of a Given Distribution Need to Take to Run the New Kernel?

As far as I know, the only thing that is absolutely required is to install the new modutils package as mentioned above. The modutils are user programs that manage kernel modules, generally device drivers that may be loaded into or removed from the kernel at runtime. The module format has changed in 2.4, so that's why you need the new version.

All of your existing user-mode programs, applications and libraries should continue to work without the need to update their source or even recompile. Binary compatibility with user programs that ran on old kernel versions is a basic requirement for the system.

I have seen reports that some existing app would crash when run under the new kernel. This isn't an error on the user's part, usually, but a bug in the kernel, and should be reported to the mailing list.

There are some new kernel features that require user programs to take advantage of them. You don't need them to run the new kernel on your old system. I don't know what they all are, but they are mentioned in the kernel config help - if you select the help when examining a configuration option, sometimes the help will refer you to other documentation or to a website that will tell you about the new software you need.

I know one feature that is probably too radical for most casual users to want to mess with on an existing distribution install. This is the DevFS filesystem. With DevFS, the /dev directory is initially empty; a special file is created when its driver loads (either at boot time or when the driver's module is loaded) and disappears when that module is unloaded.

This is a vast architectural improvement but you probably don't want to just slap it into an existing distro that expects its /dev files to stay put, and there are some issues about managing these files that need to be dealt with (like how to set the default permissions on one of these dynamic files). Anybody but the Linux From Scratch people will probably want to wait for a distro that supports that as an integrated whole.

Monkeywrenching the Virtual Machine

I'd like to say a few additional words about why it is so important that the quality of the kernel, not just for Linux but for any operating system, be so high. One could argue that it's just as critical that the system libraries be error free because an error in a library could affect any program that uses it, but really the kernel is a special case.

This is because of the non-local effects of having the virtual machine break down.

Reliably functioning computer programs, both kernels and user-mode programs, are virtual machines, of which the parts are the data structures and the algorithms which operate on them. We have stacks, queues, lists, subroutines, interrupts (both hardware interrupts in the kernel and software interrupts in user programs, such as signals), threads, locks and so on.

Our programming languages, libraries and kernels give us a wide array of machine parts and then we assemble these into very elaborate machines that, if rendered as physical mechanisms, would put the finest sportscar to shame - as long as the programs are written correctly.

The problem is that if you've got certain kinds of bugs in your program, such as heap corruption, buffer overflows, race conditions or failure to protect a critical region, then all hell breaks loose. It's as if the Army pulled a Howitzer up to your nice sportscar and put a shell through the engine - but then it kept running. Programs don't explode when they're damaged; they're happy to continue running along, executing each instruction in sequence, but they're likely not doing what you want.

Consider yourself lucky if you get a segmentation violation - at least then you find out right away something is wrong, rather than an hour later after you've saved your work to disk into a file that turns out to be corrupt.

I discussed this in a letter entitled Algorithms have unclear boundaries that I originally wrote to the patent office and also submitted to the Forum on Risks to the Public in Computers and Related Systems. (I recommend that anyone who uses computers read Risks - years of following the Risks Forum is what made me such a freak about software quality).

I once followed a discussion of programming assertions on the Usenet News. Assertions are tests included in debug builds of programs that test that a condition that must be true actually is true. If the condition is found to be false then the program is halted immediately so the programmer can check out what's wrong. Assertions speed software development by catching your mistakes quicker, doing some testing automatically for you every time you run the program.

One common practice is to test that an impossible condition is not true, for example, if a variable is allowed to hold one of three values then you assert that it does not contain a fourth. But one participant in the discussion argued vehemently that if he could prove, through the logical flow of the program code as written, that an impossible condition could never occur, it was a waste of time to include assertions that tested for impossibilities.

I feel that he was wrong though, and it's likely he spends a lot of extra time needlessly debugging his programs that he could save by using more assertions. His argument only holds while the virtual machine is intact. When the virtual machine breaks down, impossible conditions start coming fast and hard, and peppering your code with assertions will warn you right away this is happening. It's impossible to know ahead of time what impossible conditions to test for; in practice you test for them wherever it's convenient.
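To make the three-values example concrete, here is a minimal sketch of an assertion written in shell, chosen only to match the commands used elsewhere in this article (in C you would normally reach for the standard assert macro). Every name in it, including the three state names and the exit status 70, is hypothetical.

```shell
#!/bin/sh
# assert DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND; if it fails, reports the broken assumption and halts
# the program immediately. Exit status 70 is an arbitrary choice.
assert() {
    assert_desc=$1
    shift
    if ! "$@"; then
        echo "assertion failed: $assert_desc" >&2
        exit 70
    fi
}

# is_valid_state STATE: true only for the three legal values.
is_valid_state() {
    case "$1" in
        idle|running|stopped) return 0 ;;
        *) return 1 ;;
    esac
}

# describe_state STATE: a routine whose callers "cannot" pass a fourth
# value, which is exactly why the assertion is worth having.
describe_state() {
    assert "state is one of idle/running/stopped" is_valid_state "$1"
    echo "state: $1"
}
```

Calling "describe_state running" prints the state as usual, while "describe_state bogus" stops the script on the spot with a message naming the violated assumption, instead of letting the impossible value travel further into the program.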

Now how does this long theoretical discussion apply to the kernel?

Normal user mode programs on modern operating systems like Linux run in protected memory, in which the program has the perception it possesses the entire memory space of the whole machine and it is impossible for one program to use a memory access to affect another. The protected memory is managed by the kernel and enforced by the memory management unit, a component of modern microprocessors.

If the virtual machine of one user mode program breaks down, it may act erratically or be terminated by the system, but it is unlikely that it will harm any other programs.

Besides keeping the system more reliable for users and protecting user data, protected memory makes life easier for programmers because an error in your program will at worst terminate the application. You find out right away something is wrong, if you're using a debugger you get helpful information on what the problem is, and your program doesn't crash the machine so you don't have to wait to reboot to continue your work.

Don't take protected memory for granted - there are lots of systems that still don't have it. The classic Mac OS doesn't, and I've spent much time in my career waiting for a Mac to restart because of some silly pointer bug. The BSD/Mach-based Mac OS X that is currently in beta testing will be Apple's first publicly released, widely used protected memory OS (there was also A/UX, an early Mac Unix, but it wasn't meant for widespread consumption).

User mode programs on Linux can affect each other, but they do it through carefully managed channels of communication that are directed by the kernel. Most familiar are TCP/IP networking and files on the hard drive, but there's also Unix domain sockets, pipes and signals. Programs can expose the guts of their memory to direct access by other programs by using shared memory via such methods as the mmap system call, but they only do this when they want to and typically they do not expose critical data.

These are all well-defined communications pathways. It is possible for one program to crash another through one of these pathways (for example, by writing a corrupt file to disk that is used by another program) but it is much harder in general and even then the problem is localized.

The kernel is a special case, though. In itself, it is a particularly complex virtual machine - both within its own operation, and in the system call and special device file interface it presents to user programs - it presents the hardware to the user programs as an external virtual machine. It sits in the middle of everything, between each user program and the hardware, between different pieces of hardware that communicate with each other via hardware buses and DMA, and between user programs running together on the same machine and even on different machines that are communicating via a network protocol.

The kernel effectively has root privilege on your machine. If a program has lesser privilege, that is because the kernel is enforcing that policy - but in reality, the kernel can do anything it wants if it should get an inclination to.

It all runs in one big virtual machine. The kernel does not have protected memory within itself. The situation is complicated because parts of the kernel run within the virtual memory space of user programs, and the kernel manages the memory spaces itself, and also makes direct access to physical memory, so the memory architecture of the running kernel is a complicated thing. But there's really no protection against some part of the kernel screwing up another part.

And if the kernel's virtual machine breaks down, just a little bit, not so much as to bring your machine crashing down, you can create pathological communications pathways within the kernel.

An extreme case (I haven't seen this actually happen) would be a pointer bug in a device driver that caused the driver to overwrite some critical memory data structure that was used by a journaled filesystem like ReiserFS. Lots of people think journaled filesystems are completely reliable because they arrange to write filesystem metadata only atomically. First the metadata is streamed into the journal, and only after it is complete is it then copied to the filesystem itself, and it is done in such a way that if the process is interrupted at any time (as by a power failure) then the integrity of the filesystem will be preserved.

But what if a buggy driver scrawls some bogus data into the memory used by the journaled filesystem just before it's written to disk? Think about that the next time you install the driver for some oddball piece of hardware into the computer you're using to write your memoirs.

Something I have seen happen many times, when I was a "Debug Meister" at that Big Fruit Company in Cupertino, is for an error in the operating system (the Mac OS System in this case) to screw up data structures used by some other part of the system during some system call. When a user application later makes that system call, something else happens other than was documented by Inside Macintosh - the system behaves incorrectly, or returns bogus results.

The most straightforward and methodical way to test this is by writing test tools that try out all the different system calls, and vary their parameters over the acceptable ranges and ensure that the results returned are also within the documented range. You also try making system calls with illegal parameters to ensure that an appropriate error code is returned.

This is valuable, but the tools are tedious to write and often don't exercise the system all that well. I don't see a lot of these kinds of tools available in the Free Software community but it would be valuable to write some (that's part of what I did as a QA engineer at Apple).
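As a very small taste of the methodical approach described above, here is a sketch in shell. A shell script cannot issue system calls directly, so it exercises the file system calls indirectly through standard utilities; a real test tool would be written in C and would cover far more. The function names are hypothetical, and the sketch assumes mktemp and /dev/urandom are available.

```shell
#!/bin/sh
# roundtrip_ok KILOBYTES: write random data, copy it, and verify that the
# copy is byte-for-byte identical. Returns 0 on success, non-zero otherwise.
roundtrip_ok() {
    rt_dir=$(mktemp -d) || return 1
    dd if=/dev/urandom of="$rt_dir/src" bs=1024 count="$1" 2>/dev/null &&
    cp "$rt_dir/src" "$rt_dir/copy" &&
    cmp -s "$rt_dir/src" "$rt_dir/copy"
    rt_status=$?
    rm -rf "$rt_dir"
    return "$rt_status"
}

# open_missing_fails: an illegal request (reading a file that does not
# exist) must produce an error rather than appear to succeed.
open_missing_fails() {
    om_dir=$(mktemp -d) || return 1
    if cat "$om_dir/no-such-file" >/dev/null 2>&1; then
        om_status=1   # the read "succeeded": something is badly wrong
    else
        om_status=0
    fi
    rmdir "$om_dir"
    return "$om_status"
}
```

To vary the parameter over a range, loop over sizes: for kb in 1 64 1024; do roundtrip_ok "$kb" || echo "round trip FAILED at ${kb}K"; done. Both checks should pass on any healthy kernel, so a failure under a test kernel is worth reporting.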

What is also very valuable is to stimulate the kernel with many applications that are otherwise expected to work reliably, because they have worked reliably with previous kernels. There are far more programs meant for some real purpose than there are test tools and so using these you can get much broader coverage than a test tool would typically do. They're usually more interesting to spend your days with too.

You want to try out these applications on lots of different hardware configurations because of the problems of hardware-dependent code creating pathological communications pathways with the programs. And in fact at Apple it was very common that a tester would report that some commercial application would work reliably on one model of Macintosh with a new version of the System, but not another, and often this was because of some bug in a hardware driver that surfaced in the misbehavior of a video game or spreadsheet.

At this point I've probably scared you beyond wanting to test at all. But the situation is not as grim as it might sound. The kernel wouldn't work very well at all if it was not highly reliable to start with, and there are some things about the kernel and the way it is developed that make it much more robust than is likely to be the case with other operating system kernels.

One factor that adds to Linux' reliability is that it is cross-platform. It supports a number of different microprocessors as well as the S/390 mainframe processor. It is used on a very inhomogeneous population.

Another is that it is distributed as configurable source code. There are widely varying options for some ways the kernel will work, and even with one set of features for a given architecture you can choose to optimize for a particular processor.

This is good news because both factors help to bring out latent bugs. Some bugs only cause trouble rarely, or don't show up at all but rear their head after a major modification to the system. But since the kernel is distributed as source code, and built for many different systems, it is likely that the different conditions of one system - often the fact that memory is laid out differently, or that the code is built with different options - will stimulate the bug repeatably on at least one configuration so it can be found and fixed early.

Contrast this with, say, the Windows 2000 kernel, which only works on Intel-architecture microprocessors, all of which run code copied from a single build of the system by Microsoft's release engineers. This is a very homogeneous population and they do not have the benefit that varying so many parameters brings to Linux. (Note also that when Be, Inc. ported the BeOS to the Intel architecture from PowerPC, although they found that there was vastly more market interest in Pentium BeOS than PowerPC BeOS, they still support the PowerPC version because it helps to ensure the quality of their code - I'm sure that Microsoft, or at least Microsoft's engineers, will ultimately regret abandoning PowerPC and Alpha for this reason.)

(By the way, this is one benefit of doing cross-platform development of user applications too. You definitely want to get people who use different processors to work with your code and if possible make it work with other compilers than gcc and on different operating systems entirely - it makes your code very robust.)

Also, many of the kernel developers have been using the development kernels on their own personal machines for a long time and often have subjected them to heavy stress testing loads. There's been a lot of time in development for kernel bugs to be found and fixed.

So it's not all that likely that you're going to have really brain-damaged behaviour.

I'm so concerned about it not because I think it will be common, but because if it happens it will be hard for the people it happens to to track it down - it would appear that there was a bug in a program that wasn't at fault, and that program's developers probably wouldn't have the same kernel bug so they wouldn't be able to figure it out.

It would be best if such problems were found in testing rather than in production machines, or on a machine owned by someone who wasn't an expert user.

Bjarne Stroustrup on the Humanity of Programming, posted 9 Jan 2001 at 06:59 UTC by goingware » (Master)

Lots of people think C++ is a screwed up language, and I must admit I've seen my share of incomprehensible C++ code. But let's not mistake the sins committed in the name of the creator for his message. (I've found I can write beautifully appealing code in C++.)

I highly recommend the chapters on software development and design in The C++ Programming Language Special Edition to anyone, not just to C++ programmers. (The first of these chapters is of most interest to programmers in general, I think the next to anyone doing object-oriented programming, and the third is specific to C++, as I recall).

Of most interest here is page 716, 23.5.3 Individuals:

Use of design as described here places a premium on skillful designers and programmers. Thus, it makes the choice of designers and programmers critical to the success of an organization.

Managers often forget that organizations consist of individuals. A popular notion is that programmers are equal and interchangeable. This is a fallacy that can destroy an organization by driving out many of the most effective individuals and condemning the remaining people to work at levels well below their potential. Individuals are interchangeable only if they are not allowed to take advantage of skills that raise them above the absolute minimum required for the task in question. Thus, the fiction of interchangeability is inhumane and inherently wasteful.

I was pretty astounded when I read that. Not that Bjarne had said it - I'd exchanged a few emails with him and had always been impressed with his thoughtfulness. But I've read quite a lot of stuff about the management of programmers, formal methodologies and the like, and this was the very first time I'd found the advice that management should be humane.

Where do you enter the humanity into your chart in Microsoft Project?

Regarding not sleeping and coding all by yourself, please read my piece on Large Scale Individual Software Development on WikiWikiWeb. Because it's on Wiki, you can edit it and add comments. For a little taste:

"So he decided to watch what the government was doing, scale it down to size, and live his life that way."
- Laurie Anderson (quoted approximately from memory)

I spent about 11 months last year writing a vector graphic editor in C++ using the ZooLib cross-platform application framework. When I made a build for the client, I usually delivered for both Mac and Windows from the exact same codebase (it also supports BeOS and POSIX platforms such as Linux).

My client got the bright idea to require me to deliver a new build once a week. She wanted, among other things, to be able to show the investors we were making regular progress towards our goal by actually demonstrating my development builds to them.

So for several months my focus was on little more than getting them the next build, with a few demonstrable, completely implemented and debugged new visible features each week, rather than making meaningful forward progress on the program by doing more important things like laying architectural groundwork that might not be immediately visible to the user. I wasn't going to be caught dead delivering a build (way before alpha) that crashed in front of an investor.

I finally called her up in desperation and told her I simply could not work under such conditions anymore.

The result of that? Development progress accelerated.

But there was no satisfying the client, who wanted the product as soon as possible and constantly called to check on progress. They made it clear they were upset at me taking four days off from work for my own wedding on July 22. I went months at a time without a day off and many times worked 24 hours at a stretch.

There was simply no satisfying them though. They were not technical people and although they claimed they trusted my judgement and explanations, I often had the feeling they were just pretending to understand me when I tried to explain to them why the program they had spec'ed for me was very difficult to write.

In the end, a substantial amount of money was due and I had just worked a 29-hour day slaving to get them their feature-complete beta. Although we had a regular invoicing schedule, they asked me to do them the favor of not charging them until the beta was delivered. I had a highly debugged program, complete but for one feature, which, although it was important, wouldn't be hard to write once I had a few days to rest.

I'd gone a long time without getting paid because of this favor I'd granted them, and because of it they owed me a lot more money than they would for the regular invoice.

At the end of this 29-hour day the client called to say that they wouldn't give me the money they owed me until I delivered the product feature-complete. They didn't owe me for feature completion - they owed me for several weeks of work I'd done before.

I told them that if they did that, I'd terminate our business relationship, and hung up the phone. A few days later I received an email acknowledging that I'd ended the contract (which I hadn't - I only told them I would if they didn't pay me).

Increasingly stern letters from my attorney have gone unanswered. I'm trying to figure out a good collection agency in San Francisco. I'd really meant to put them into collections just before Christmas to return the special Christmas gift they gave me but I was too busy trying to recover from the mess.

That's what you get for not working like a normal human being. Beaten to death and kicked in the teeth as thanks.

They seemed like such nice people when we started the project. The best advice I can give anyone here is to learn to be a good judge of character - I have a hard time with that, but my wife is very good, and many times I've wished I'd listened to her sooner. Anyone can seem honest and good-natured when things are going well, but will they still be there to support you when, say, the high-tech stock market collapses, taking with it the investment community's interest in internet startups?

Bug in Advogato Posted the Above in Wrong Article, posted 9 Jan 2001 at 07:56 UTC by goingware » (Master)

If you're confused about what my comment above could possibly have to do with testing the Linux kernel, the answer is that it doesn't. It was meant as a comment on Getting Over Bad Habits, which was the next article posted after mine.

This is apparently a bug in the Advogato software, one which I've seen before.
