Older blog entries for apenwarr (starting at number 583)

2 Jan 2011 »

Happy New Year

If, for some reason, we make some giant mistakes and IBM wins, my personal feeling is that we are going to enter sort of a computer Dark Ages for about 20 years. Once IBM gains control of a market sector, they almost always stop innovation. They prevent innovation from happening.

-- Steve Jobs in 1985

Ouch.

Syndicated 2011-01-02 03:05:25 from apenwarr - Business is Programming

22 Dec 2010 »

git vs. professionalism

I followed a random link today that led me to the complete email archive of the discussion, following Theo de Raadt's expulsion from NetBSD in 1994, that led to the creation of OpenBSD.

Check out this quote:

I want _EVERYONE_ working on NetBSD to behave in a professional manner, in public _AND_ in private. Obviously, the latter can't be enforced, but if people complain about "unprofessional" behaviour, then it doesn't rightly matter _where_ it occurred.

-- a NetBSD core team member

The entire email chain - from both sides - is totally pointless politics. But it's educational, because it reminds us of a few things programmers have learned since then:

enforced professionalism sounds scarily draconian;
the best programmers are not the best company representatives, and expecting them to be is silly;
withholding technical stuff (CVS access) in exchange for political stuff (an apology) just results in angst and wasted time;
accepting/refusing code based on anything other than technical merits (and valid licensing, I suppose) is stupid;
when you give a "core team" special powers, they abuse those powers;
you can hold emotional power over a programmer by controlling whether their work is used or not used, and that power needs to be used responsibly.

When I say we've learned these things, maybe I'm just being optimistic. Probably there are plenty of teams with equally stupid politics going on. I hear the Java Community Process is similarly a waste of time.

What would happen if you tried to kick Linus Torvalds off the "Linux core team?" He would laugh at you. There is no core team. There is nothing to kick him off of. There is no CVS commit access to revoke. He can be as much of a bastard as he wants, and a bunch of self-appointed "project representatives" can't stop him from giving us all the awesome code he wants to.

Of course, Linus is synonymous with Linux; if anyone was going to be "kicked out", it would be anyone or everyone but him. So it's even more funny that he's the guy who wrote git, the tool that makes it impossible to block anybody. Now he can be even more of a bastard, because even if people hate him and quit, they'll still share their code with the world, including him.

As far as I know, all the BSDs still use CVS. Of course they do. Switching away from that would be like admitting their whole world view is wrong.

that is what the above is. if it has anger in it, then that is what is in it. it is an honest and open sharing of ideas and feelings. those feelings include anger, hurt, apprehension, loss. but i'm trying to find a way to forget about that past stuff, and you're not making it easy.

-- Theo de Raadt

Syndicated 2010-12-22 07:53:04 from apenwarr - Business is Programming

14 Dec 2010 »

The only build system that might someday replace make...

...is djb redo.

There are only two problems. In order of increasing difficulty:

1. you've never heard of it.

2. it doesn't exist.

Well, I just solved problem #1. Progress!

Problem #2 is a lot tougher. You see, the page linked above talks about a theoretically great build system, which perhaps D. J. Bernstein (djb), maker of qmail and djbdns, has implemented part of. But he doesn't link to an implementation, probably because his own code isn't up to his (rather exacting) standards yet, or he's gotten busy with other things and hasn't finished thinking through all the details.

I would like to tell you, in a concise way, exactly why redo is literally the most amazingly groundbreaking build system since the original invention of 'make', but I don't know how. So lacking conciseness, I'll just make a few grandiose claims:

it can do everything make can do;
with no baked-in assumptions about what you're building;
with much less code;
with much greater parallelism;
with finer-grained dependencies;
with much less syntax (actually nothing but /bin/sh);
while supporting recursion and full dependency information simultaneously (no Recursive Make Considered Harmful crap);
yet build scripts are highly modular and readable;
and you can checksum your targets instead of using timestamps;
and your build scripts run linearly instead of an orderless "ruleset";
with no implicit rules required;
and implementing C header autodependencies is completely sane;
and dependency checks involve no forking or parsing so it's crazy fast;
and you can incrementally convert parts of your project;
because it can play well with other build systems;
including jobserver compatibility with make -j;
oh, and you can write a plug-compatible toy
implementation in 100 lines of shell.

Sounds awesome? You bet it is. Well, except for the not existing part.

So I wrote it

Yes. djb's design is so simple that I thought I could do it in a couple of days. Actually I did have a working version in a couple of days. And normally, I'm a big fan of "release early, release often." But when it comes to build systems, well, you know, it's kind of a crowded market. I figured I'm only going to get one chance to convince you that this is the most fabulous thing ever.

That's why I've spent more than a month - days, evenings, and weekends - of my so-called vacation¹ working on this stupid thing. It has man pages. It has a ridiculously huge README+FAQ. It has an installer. It has parallelism. It has unit tests, oh god, the unit tests. It has locking. In fact, it has so much parallelism and locking that it uncovered a race condition in MacOS X's fcntl(). It has a GNU make compatible jobserver, so that 'redo -j5' and 'make -j5' can call each other recursively and "just work." It has a few extensions that djb didn't talk about, like checksum-based dependencies. For testing, I converted a pretty huge build system (one that builds the Linux kernel and a bunch of libraries for two different target platforms), with some of my targets depending on more than 12000 files. redo can check every single fine-grained dependency in the whole project in about 4 seconds.

Version 0.01 actually delivers on every one of those grandiose claims.

As I'm writing this, it's 1574 lines of python. That's a lot smaller than the source for GNU make.

And for fun, I also wrote a super-stripped-down "forwards compatible" implementation (without support for actual dependencies; it just assumes everything is always dirty) in 100 lines of shell. That's less than 2 kbytes, and it works with any input file that works with full-sized redo (modulo any as-yet-undiscovered bugs), because djb's design is that awesome.

And you know what? I almost certainly still, after all that effort, screwed up and it probably still won't work for you. Dammit. That's why I paranoidly called it version 0.01 instead of 1.00. But please give it a try:

apenwarr's redo on github

(The super oversized README is visible at that link. Or you can read the man pages.)

Disclaimer

Because redo is currently written in python (and it does a lot of fork-exec calls), it runs a little slower than I would like. Someday rewriting it in C would make it faster - probably at least 10x faster. But it's reasonably fast already, and the increased parallelism and faster dependency checking often makes up for the slowness.

The non-optimal performance is the only significant flaw I'm aware of at the moment... but I'm sure you'll try it out and tell me otherwise, right?

You should join the redo mailing list

You can find the mailing list on Google Groups at http://groups.google.com/group/redo-list.

Of course the mailing list archives are currently empty, because this is the first time I've announced it.

It might not look like it, but yes, you can subscribe without having a Google Account. Just send a message here: redo-list+subscribe@googlegroups.com.

Footnote

¹ I told the guard at the U.Shttp://apenwarr.ca/log/Canada border that I was "between jobs." He offered his sympathies. I said, "No, I mean, literally." He said, "Oh, normally people use that as a euphemism." Anyway, it's now painfully clear that I suck at vacations.

Syndicated 2010-12-14 11:34:07 from apenwarr - Business is Programming

13 Dec 2010 (updated 14 Dec 2010 at 04:06 UTC) »

Everything you never wanted to know about file locking

(Foreshadowing: I found a bug in MacOS X 10.6's fcntl(F_SETLK) locking that could cause corruption of sqlite databases. To see if your system has the bug, compile and run locky.c.)

I've never previously had the (mis)fortune of writing a program that relied on file locking. Well, I've used databases and gdbm and whatnot, and they "use" file locking, but they've always hidden it behind an API, so I've never had to actually lock my own files.

I heard file locking is terribly messy and arcane and varies wildly between operating systems, even between different Unix systems; even between different versions of the same system (like Linux). After some experimentation, I can confirm: it really is that arcane, and it really is that variable between systems, and it really is untrustworthy. I'm normally a pessimist when it comes to computers, but Unix file locking APIs have actually outdone even my own pessimism: they're worse than I ever imagined.

Other than simple lockfiles, which I won't go into (but which you might just want to use from now on, after reading this article :)), there are three Unix file locking APIs: flock(), fcntl(), and lockf().

flock() locking

flock() is the simplest sort of lock. According to my Linux man page, it dates back to BSD. It is *not* standardized by POSIX, which means that some Unix systems (probably SysV-related ones, I guess) don't support flock().

flock() locks an entire file at a time. It supports shared locks (LOCK_SH: multiple people can have the file locked for read at the same time) and exclusive locks (LOCK_EX: only one person can make an exclusive lock on the file; shared and exclusive locks are not allowed to coexist). If you learned about concurrency in textbooks, flock() locks are "reader/writer" locks. A shared lock is a reader lock, and an exclusive lock is a writer lock.

According to the Linux man page for flock(), flock() does not work over NFS.

Upgrading from a shared flock() lock to an exclusive one is racy. If you own a shared lock, then trying to upgrade it to an exclusive lock, behind the scenes, actually involves releasing your lock and acquiring a new one. Thus, you can't be guaranteed that someone else hasn't acquired the exclusive lock, written to the file, and unlocked it before your attempt at upgrading the lock returns. Moreover, if you try to upgrade from shared to exclusive in a non-blocking way (LOCK_NB), you might lose your shared lock entirely.

Supposedly, flock() locks persist across fork() and (unlike fcntl locks, see below) won't disappear if you close unrelated files. HOWEVER, you can't depend on this, because some systems - notably earlier versions of Linux - emulated flock() using fcntl(), which has totally different semantics. If you fork() or if you open the same file more than once, you should assume the results with flock() are undefined.

fcntl() locking

POSIX standardized the fcntl(F_SETLK) style of locks, so you should theoretically find them on just about any Unix system. The Linux man page claims that they've been around since 4.3BSD and SVr4.

fcntl() locks are much more powerful than flock() locks. Like flock(), there are both shared and exclusive locks. However, unlike flock(), each lock has an associated byte range associated with it: different byte ranges are completely independent. One process can have an exclusive lock on byte 7, and another process can have an exclusive lock on byte 8, and several processes can have a shared lock on byte 9, and that's all okay.

Note that calling them "byte ranges" is a bit meaningless; they're really just numbers. You can lock bytes past the end of the file, for example. You could lock bytes 9-12 and that might mean "records 9-12" if you want, even if records have variable length. The meaning of a fcntl() byte range is up to whoever defines the file format you're locking.

As with all the locks discussed in this article, these byterange locks are "advisory" - that is, you can read and write the file all day long even if someone other than you has an exclusive lock. You're just not supposed to do that. A properly written program will try to acquire the lock first, at which time it will be "advised" by the kernel whether everything is good or not.

The locks are advisory, which is why the byte ranges don't have to refer to actual bytes. The person acquiring the lock can interpret it however you want.

fcntl() locks are supposedly supported by NFSv3 and higher, but different kernels do different random things with it. Some kernels just lock the file locally, and don't notify the server. Some notify the server, but do it wrong. So you probably can't trust it.

According to various pages on the web that I've seen, fcntl() locks don't work on SMB (Windows file sharing) filesystems mounted on MacOS X. I don't know if this is still true in the latest versions of MacOS; I also don't know if it's true on Linux. Note that since flock() doesn't work on NFS, and fcntl() doesn't work on SMB fs, that there is no locking method that works reliably on all remote filesystems.

It doesn't seem to be explicitly stated anywhere, but it seems that fcntl() shared locks can be upgraded atomically to fcntl() exclusive locks. That is, if you have a shared lock and you try to upgrade it to an exclusive lock, you can do that without first releasing your shared lock. If you request a non-blocking upgrade and it fails, you still have your shared lock.

(Trivia: sqlite3 uses fcntl() locks, but it never uses *shared* fcntl() locks. Instead, it exclusively locks a random byte inside a byterange. This is apparently because some versions of Windows don't understand shared locks. As a bonus, it also doesn't have to care whether upgrading a lock from shared to exclusive is atomic or not. Update 2010/12/13: specifically, pre-NT versions of Windows had LockFile, but not LockFileEx.)

fcntl() locks have the handy feature of being able to tell you which pid owns a lock, using F_GETLK. That's pretty cool - and potentially useful for debugging - although it might be meaningless on NFS, where the pid could be on another computer. I don't know what would happen in that case.

fcntl() locks have two very strange behaviours. The first is merely an inconvenience: unlike nearly everything else about a process, fcntl() locks are not shared across fork(). That means if you open a file, lock some byte ranges, and fork(), the child process will still have the file, but it won't own any locks. The parent will still own all the locks. This is weird, but manageable, once you know about it. It also makes sense, in a perverse sort of way: this makes sure that no two processes have an exclusive lock on the same byterange of the same file. If you think about it, exclusively locking a byte range, then doing fork(), would mean that *two* processes have the same exclusive lock, so it's not all that exclusive any more, is it?

Maybe you don't care about these word games, but one advantage of this absolute exclusivity guarantee is that fcntl() locks can detect deadlocks. If process A has a lock on byte 5 and tries to lock byte 6, and process B has a lock on byte 6 and tries to lock byte 5, the kernel can give you EDEADLK, which is kind of cool. If it were possible for more than one process to own the same exclusive locks, the algorithm for this would be much harder, which is probably why flock() locks can't do it.

The second strange behaviour of fcntl() locks is this: the lock doesn't belong to a file descriptor, it belongs to a (pid,inode) pair. Worse, if you close *any* fd referring to that inode, it releases all your locks on that inode. For example, let's say you open /etc/passwd and lock some bytes, and then call getpwent() in libc, which usually opens /etc/passwd, reads some stuff, and closes it again. That process of opening /etc/passwd and closing it - which has nothing to do with your existing fd open on /etc/passwd - will cause you to lose your locks!

That behaviour is certifiably insane, and there's no possible justification for why it should work that way. But it's how it's always worked, and POSIX standardized it, and now you're stuck with it.

An even worse example: let's say you have two sqlite databases, db1 and db2. But let's say you're being mean, and you actually make db1 a hardlink to db2, so they're actually the same inode. If you open both databases in sqlite at the same time, then close the second one, all your open sqlite locks on the first one will be lost! Oops! Except, actually, the sqlite guys have already thought of this, and it does the right thing. But if you're writing your own file locking routines, beware.

So anyway, beware of that insane behaviour. Also beware of flock(), which on some systems is implemented as a call to fcntl(), and thus inherits the same insane behaviour.

Bonus insanity feature: the struct you use to talk to fcntl() locks is called 'struct flock', even though it has nothing to do with flock(). Ha ha!

lockf() locking

lockf() is also standardized by POSIX. The Linux man page also mentions SVr4, but it doesn't mention BSD, which presumably means that some versions of BSD don't do lockf().

POSIX is also, apparently, unclear on whether lockf() locks are the same thing as fcntl() locks or not. On Linux and MacOS, they are documented to be the same thing. (In fact, lockf() is a libc call on Linux, not a system call, so we can assume it makes the same system calls as fcntl().)

The API of lockf() is a little easier than fcntl(), because you don't have to mess around with a struct. However, there is no way to query a lock to find out who owns it.

Moreover, lockf() may not be supported by pre-POSIX BSD systems, it seems, so this little bit of convenience also costs you in portability. I recommend you avoid lockf().

Interaction between different lock types

...is basically undefined. Don't use multiple types of locks - flock(), fcntl(), lockf() - on the same file.

The MacOS man pages for the three functions proudly proclaim that on MacOS (and maybe on whatever BSD MacOS is derived from), the three types of locks are handled by a unified locking implementation, so in fact, you *can* mix and match different lock types on the same file. That's great, but on other systems, they *aren't* unified, so doing so will just make your program fail strangely on other systems. It's non-portable, and furthermore, there's no reason to do it. So don't.

When you define a new file format that uses locking, be sure to document exactly which kind of locking you mean: flock(), fcntl(), or lockf(). And don't use lockf().

Mandatory locking

Stay far, far away, for total insanity lies in wait.

Seriously, don't do it. Advisory locks are the only thing that makes any sense whatsoever. In any application. I mean it.

Need another reason? The docs say that mandatory locking in Linux is "unreliable." In other words, they're not as mandatory as they're documented to be. "Almost mandatory" locking? Look. Just stay away.

Still not convinced? Man, you really must like punishment. Look, imagine someone is holding a mandatory lock on a file, so you try to read() from it and get blocked. Then he releases his lock, and your read() finishes, but some other guy reacquires the lock. You fiddle with your block, modify it, and try to write() it back, but you get held up for a bit, because the guy holding the lock isn't done yet. He does his own write() to that section of the file, and releases his lock, so your write() promptly resumes and overwrites what he just did.

What good does that do anyone? Come on. If you want locking to do you any good whatsoever, you're just going to have to acquire and release your own locks. Just do it. If you don't, you might as well not be using locks at all, because your program will be prone to horrible race conditions and they'll be extra hard to find, because mandatory locks will make it *mostly* seem to work right. If there's one thing I've learned about programming, it's that "mostly right" programs are *way* worse than "entirely wrong" programs. You don't want to be mostly right. Don't use mandatory locks.

Bonus feature: file locking in python

python has a module called "fcntl" that actually includes - or rather, seems to include - all three kinds of locks: flock(), fcntl(), and lockf(). If you like, follow along in the python source code to see how it works.

However, all is not as it seems. First of all, flock() doesn't exist on all systems, apparently. If you're on a system without flock(), python will still provide a fcntl.flock() function... by calling fcntl() for you. So you have no idea if you're actually getting fcntl() locks or flock() locks. Bad idea. Don't do it.

Next is fcntl.fcntl(). Although it pains me to say it, you can't use this one either. That's because it takes a binary data structure as a parameter. You have to create that data structure using struct.pack(), and parse it using struct.unpack(). No problem, right? Wrong. How do you know what the data structure looks like? The python fcntl module docs outright lie to you here, by providing an example of how to build the struct flock... but they just made assumptions about what it looks like. Those assumptions are definitely wrong if your system has 64-bit file offset support, which most of them do nowadays, so trying to use the example will just give an EINVAL. Moreover, POSIX doesn't guarantee that struct flock won't have other fields before/after the documented ones, or that the fields will be in a particular order, so even without 64-bit file offsets, your program is completely non-portable. And python doesn't offer any other option for generating that struct flock, so the whole function is useless. Don't do it. (You can still safely use fcntl.fcntl() for non-locking-related features, like F_SETFD.)

The only one left is fcntl.lockf(). This is the one you should use. Yeah, I know, up above I said you should avoid lockf(), because BSD systems might not have it, right? Well yeah, but that's C lockf(), not python's fcntl.lockf(). The python fcntl module documentation says of fcntl.lockf(), "This is essentially a wrapper around the fcntl() locking calls." But looking at the source, that's not quite true: in fact, it is *exactly* a wrapper around the fcntl() locking calls. fcntl.lockf() doesn't call C lockf() at all! It just fills in a struct flock and then calls fcntl()!

And that's exactly what you want. In short:

in C, use fcntl(), but avoid lockf(), which is not necessarily the same thing.
in python, use fcntl.lockf(), which is the same thing as fcntl() in C.

(Unfortunately, although calling fnctl.lockf() actually uses fcntl() locks, there is no way to run F_GETLK, so you can't find out which pid owns the lock.)

Bonus insanity feature: instead of using the C lockf() constants (F_LOCK, F_TLOCK, F_ULOCK, F_TEST), fcntl.lockf() actually uses the C flock() constants (LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB). There is no conceivable reason for this; it literally just takes in the wrong contants, and converts them to the right ones before calling fcntl().

So that means python gives you three locks in one! The constants from flock(), the functionality of fcntl(), and the name lockf(). Thanks, python, for making my programming world so simple and easy to unravel.

Epilogue

I learned all this while writing a program (in python, did you guess?) that uses file locking to control concurrent access to files. Naturally, I wanted to pick exactly the right kind of locks to solve my problem. Using the logic above, I settled on fcntl() locks, which in my python program means calling fcntl.lockf().

So far, so good. After several days of work - darn it, I really hate concurrent programming - I got it all working perfectly. On Linux.

Then I tried to port my program to MacOS. It's python, so porting was pretty easy - ie. I didn't change anything - but the tests failed. Huh? Digging deeper, it seems that some subprocesses were acquiring a lock, and sometime later, they just didn't own that lock anymore. I thought it might be one of the well-known fcntl() weirdnesses - maybe I fork()ed, or maybe I opened/closed the file without realizing it - but no. It only happens when *other* processes are locking byteranges on the same file. It appears the MacOS X (10.6.5 in my test) kernel is missing a mutex somewhere.

I wrote a minimal test case and filed a bug with Apple. If you work at Apple, you can find my bug report as number 8760769.

Dear Apple: please fix it. As far as I know, with this bug in place, any program that uses fcntl() locks is prone to silent file corruption. That includes anything using sqlite.

Super Short Summary

don't use flock() (python: fcntl.flock()) because it's not in POSIX and it doesn't work over NFS.
don't use lockf() (python: does not exist) because it's not in BSD, and probably just wraps fcntl().
don't use fcntl() (python: fcntl.lockf()) because it doesn't work over SMB on MacOS, and actually, on MacOS it doesn't work right at all.

Oh, and by the way, none of this applies on win32.

Are we having fun yet? I guess lockfiles are the answer after all.

I bet you're really glad you read this all the way to the end.

Syndicated 2010-12-13 11:03:10 (Updated 2010-12-14 04:06:09) from apenwarr - Business is Programming

12 Dec 2010 (updated 14 Dec 2010 at 04:06 UTC) »

Two things about programming

2. If you think you have a reliable system for doing it, you're probably doing the computer's job

-- "extension" on news.ycombinator.org

Syndicated 2010-12-12 13:01:15 (Updated 2010-12-14 04:06:09) from apenwarr - Business is Programming

21 Nov 2010 »

Canada and the Guaranteed Annual Income idea

The Globe and Mail recently published an article titled To end poverty, guarantee everyone in Canada $20,000 a year. (Note: I think the $20,000 is a made-up number but I'll keep using it here for illustrative purposes.) It was then picked up in the comments at YCombinator news.

Unfortunately, both the article and the comments seem to misrepresent the actual proposal. A better reference is Income Security for All Canadians (pdf) that explains in more detail.¹

But you don't necessarily need more detail to understand why this idea is a good one. It's simple: the proposal doesn't change anything except the administrative overhead.

In Canada, true homelessness - sleeping on the street involuntarily - equals death. It simply gets too cold here. If you don't want people to die, you provide social services, whether through private organizations or through the government. And so we do, in a complex combination of both.

Whether or not you agree with this idea - keeping people alive and healthy and, to some extent, comfortable even if they can't earn a living - can be debated, and it often is. Every country does it differently, and the rules and amounts are constantly changing.

But in any case, in Canada, the overall outcome of our complex set of rules is clear: there is a minimum guaranteed annual income. If you're not earning it, some government agency or another is willing to give it to you, if you jump through the required hoops.

The "guaranted annual income" proposal, then, is not a proposal to start guaranteeing a minimum annual income. We already have that. It's just a renaming and simplification of what we already have. Instead of a network of complicated government agencies to guarantee your annual income, and a bunch of complicated proof requirements, we just have one agency and the only proof required is that you're a Canadian.

Where does the money come from? That's easy: it comes from where it already came from. It's the same money as before, going to the same people.

What if people abuse the system? That's easy: the same thing that already happens. It's the same money and the same people as before.

Won't it cause inflation? No, because it's the same money as before.

Won't it make people lazy and not bother to get a job? Maybe, but so will the current system. And the income we're talking about is really not a comfortable income; it's just barely enough to survive. People complain regularly, and probably quite accurately, that the current welfare situation in Canada still leaves many people below the poverty line. That won't change.

Won't people just take the money and spend it all on drugs and alcohol? Probably, yes. But they would do that anyway. Such problems can't be solved with money alone.

Why do we give it to every citizen, instead of just the citizens who need it? Good question. The answer is: because the math works better that way. Here's why:

Right now, if you're taking unemployment insurance or welfare or receive a disability pension, the short version of a complex story is that every $1 you earn - by taking a job - reduces your free government income by about $1. At a really basic level, that sounds fair; we guaranteed you would get a survival income, so if you're earning money yourself, we'll supplement it, but only up to the survival amount. Why shouldn't we give more charity to someone with no job than someone with a job?

Thinking about it more deeply, though, the answer becomes clear: because if you take away money from people when they earn money, they have no incentive to earn money.² A 20 hour a week job and $20k/year is worse than no job and $20k/year. Sure, maybe with a 40 hour a week job, you could earn $40k/year, which is more than the basic welfare amount; but to get there, you have to get started. A lot of the people who need welfare couldn't just go get a job that, on day 1, would pay more than $20k/year. They have to gain experience, get promoted, and so on. Sure, in the long term it might pay off, but if everyone was good at long term planning, we wouldn't be in this mess.

What about the rich? Are we going to give them $20k/year too? Technically yes. But just as technically, we'd obviously raise the higher tax brackets so that their effective income would be the same as before. Progressive taxation makes sense; if you earn $1, the government takes away $0.40 or whatever, but you still have $0.60 more than if you didn't earn that $1. So you still have an incentive to work more.

In summary, the main advantages of the guaranteed annual income proposal are:

it doesn't actually cost money;
it massively simplifies administration (at least in theory);
it reduces the dehumanization of proving you're needy before we'll help you (note to Ayn Rand lovers: her main complaint about welfare was this part - "to each according to their needs" - not the part where we keep people from starving);
it means that people who are too far gone (drugs, etc) to currently file for welfare could potentially get the help they're entitled to because accessing it becomes easier;
it "sounds fairer" than just giving a lot of rich people's money to poor people; we're giving money equally to everyone, rich or poor;
it combines the advantages of socialism (less starving dying homeless people means rich people can be happier and guilt-free) with the advantages of capitalism (more work leads to more income, as it should).

I am not aware of any major disadvantages of such a plan. Let me know if you think of any realistic ones and I can add them here.

It's very interesting to me that it's the conservatively-minded politicians in both Canada and the United States that seem to be working on this idea; I would have expected it to be a more liberal suggestion. Perhaps the reduced administrative overhead ("smaller government") is appealing.

A government that managed to pass such an overhaul would certainly be remembered forever in Canadian history. Um, in a good way. I think.

Footnotes

¹ If you really want a lot of detail, I recommend A Fair Country, by John Ralston Saul. It's a really good book if you want to understand where Canadian culture comes from. (Spoiler: it comes largely from our aboriginal population. Bet you didn't see that coming. Neither did I. But he convinced me. Quiz question: which of our supposed primary influences, the British or the French, is the reason we're so polite? So accepting of other cultures? So biased toward peacekeeping instead of empire-building? Answer: neither of them, obviously.) That book was where I first heard of the guaranteed annual income proposal in a well-presented way.

² To be honest, I'm a little shocked and/or disbelieving that it works this way, because it's so obviously stupid. But I've talked to people on the receiving ends of these programmes, and more than once I've heard that there's no point for them to get a job, because if they did, they'd be cheating themselves out of free money. A rather obvious halfway implementation of the guaranteed annual income scheme would be to simply deduct only a fraction of your earnings from your government supplements; it's not as elegant as the whole deal, but at least you aren't outright discouraging people from getting a job. (If that's already what happens and I'm misinformed, well, maybe that's why this GAI idea is necessary; because it's so complicated that even the people receiving these services don't understand them. I admit to never having collected welfare, disability, or unemployment insurance, so I have no real experience here.)

Syndicated 2010-11-21 07:30:01 from apenwarr - Business is Programming

17 Nov 2010 »

On sheep, wolves, and sheepdogs

I never thought I would have anything nice to say about an article that encourages more people to carry guns, including into church, but this one is very well written and makes good points:

"I will never be caught without my gun in church." I asked why he felt so strongly about this, and he told me about a cop he knew who was at a church massacre in Ft. Worth, Texas in 1999. In that incident, a mentally deranged individual came into the church and opened fire, gunning down fourteen people. He said that officer believed he could have saved every life that day if he had been carrying his gun. His own son was shot, and all he could do was throw himself on the boy's body and wait to die. That cop looked me in the eye and said, "Do you have any idea how hard it would be to live with yourself after that?"

Even more than the United States, Canadian citizens have a "sheep" mentality in the terminology of the article. Most individual Canadian citizens would be thoroughly unprepared if violence broke out. As a country, we also spend vastly less per capita on our military. Of course, it works for us; as a country, we get into fewer wars, and as individuals, we get into fewer fights, and in both cases our fights are (at least somewhat) less violent and better justified.

I've often thought that Canada is cheating a bit, however. The reason Canada doesn't need a huge military is that if someone seriously thought about attacking us, the U.S. is right next door. The Canadian pacifist system is highly efficient and desirable - as long as someone bigger and stronger is nearby to scare off the predators.

Syndicated 2010-11-12 09:01:31 from apenwarr - Business is Programming

14 Nov 2010 »

Focus

I'm a little late to the party, but this article by Derek Sivers is really good. I especially like the top part, about focus. Not because I expect this advice will do anyone any good (I think you're either driven and focussed or you're not), but because it gives me insight into my own behaviour:

You're surrounded by cool tempting people, hanging out casually, telling you to relax.
But the casual ones end up having casual talent and merely casual lives.
Looking back, my only Berklee classmates that got successful were the ones who were fiercely focused, determined, and undistractable.
[...]
When you emerge in a few years, you can ask someone what you missed, and you'll find it can be summed up in a few minutes.
The rest was noise you'll be proud you avoided.

-- Derek Sivers

Ironically, I read this because I was distracted from real work.

Syndicated 2010-11-10 23:02:26 from apenwarr - Business is Programming

11 Nov 2010 (updated 12 Nov 2010 at 03:03 UTC) »

A quick review of btrfs

Yesterday I tried out btrfs, an experimental new Linux filesystem that seems to be based on many of the ideas of the more famous "zfs," which I gather is either not available on Linux or not properly ported or not licensed correctly or whatever. I really didn't do my zfs research; all I know is it's not in the 2.6.36 Linux kernel, but btrfs is, so btrfs wins.

Anyway, the particular problem I wanted to solve was for our startup, EQL Data. Basically we end up with some pretty big mdb files - a gigabyte or two, some days - and we want to be able to "branch" and "merge" them, which we do using git. Perhaps someday I'll write an article about that. Meanwhile, the branching and merging works fine, but what doesn't work fine is creating multiple copies so that different people can modify the different branches in different ways. It's not that bad, but copying a gigabyte file around can take a few seconds - more on a busy virtual server - and so if we want startup times to be reasonable, we've had to resort to pre-copying the files in advance and keeping them in a depot so that people can grab one, modify it, and send it back for processing. This works, but wastes a lot of background I/O (which we're starting to run out of) as well as disk space (which directly costs money).

So anyway, the primary thing that I wanted was the ability to make O(1) copy-on-write copies of large files. btrfs is advertised to do this. I won't keep you in suspense: it works! There's a simple btrfs command called "bcp" (btrfs cp), and it works the way you'd think. Only modified blocks take any extra disk space.

Our second problem relates to the way we actually run the databases. We have a complex mix of X, VNC, flash, wine, Microsoft Access¹, and some custom scripts. Each session runs in its own little sandbox, because as you might have heard, Windows apps like to think they own the world and scribble all over everything, especially the registry, which we have to undo after each session before we can start the next one.

We've tried various different cleanup techniques; the problem is that each wine tree consists of thousands of files, so even checking if a particular tree has been modified can be a pretty disk-grindy operation. With experimentation, we actually found out that it can be faster to just remove the whole tree and copy a new one, rather than checking whether it's up to date. This is because the original tree is usually in cache, but the old one often isn't, and reading stuff from disk is much slower than writing (because writeback caching lets you minimize disk seeks, while you can't really do this when reading). Nevertheless, it still grinds the disk,² and it also wastes lots of disk space, though not as much as you might expect, since we use hardlinks extensively for individual files. Too bad we can't hard link directories like Apple does in Time Machine.

Luckily, btrfs can come to the rescue again, with O(1) writable snapshots of entire subvolumes. This is slightly less convenient than using bcp on a single file. Snapshots don't really act much like hardlinks; they work by quietly creating an actual separate filesystem object and showing it as if it's mounted on a subdir of your original filesystem. (This doesn't show up in /proc/mounts, but if you stat() a file in the copy, it shows a new device id.) Presumably this is because otherwise, you'd end up with files that have the same device+inode pair, which would confuse programs, and you can't correctly renumber all the inodes without going from O(1) to O(n) (obviously). So all in all, it's clever, but it takes more steps to make it work. It still runs instantaneously once you figure it out, and the extra steps aren't that bad. It means we can now have the ability to clone new virtual environments instantly, and without taking any extra disk space. Nice!

Oddly, in my 2.6.36 kernel (plain btrfs as included in Linus's pristine kernel), I could *create* snapshots as a normal user (once I had set things properly with chmod), but I couldn't *delete* them as a normal user; I had to be root. This can't really be considered correct behaviour, so I might have to file a bug or join the mailing list or something. Later.

Other observations:

creating a filesystem was instantaneous; no boring writing of inode tables like in ext2/ext3.
mounting a filesystem was also instantaneous; no journal flushing or scanning.
creating large files seems to cause a strange burp after writing 128 megs or so; it stops for a rather long time, then resumes writing. Perhaps this has to do with allocating a new extent or something. I'm not sure yet whether this is normal or repeatable behaviour, and it probably won't affect my use case all that much.
despite the aforementioned burp, creating large files seems somewhat faster than on ext3 (my imagination?) and deleting them happens instantly, instead of after a lot of disk grinding with ext3. This seems to be the magic of the extent-based disk allocation again, I suppose.
adding and removing devices in the volume (I was using /dev/loop devices for testing) works great; removing is obviously slower than adding, since it has to move data to the remaining devices, but it seemed flawless. The filesystem stayed fully operational (and not noticeably slower) the whole time, which implies that they have their I/O priorities right.
Slicehost doesn't compile btrfs into their standard kernels, apparently; I'll have to see if I can make it work. If not, it's time to switch hosting providers. I did my testing on my own machine at home.
btrfs has big loud warnings everywhere about how unstable it is and how the on-disk format is likely to change. Since all our usage will be with short-lived session files, this shouldn't be a problem for us... but be careful if you get excited about using it. I'm not sure I'd trust it as my root fs right now, at least not if I plan to ever upgrade my kernel.

Short version: btrfs is very nicely done. Will it be the next major Linux filesystem? Well, I bet on reiserfs last time, and that didn't really work out, what with the bugs and the insanity and the murder. Nevertheless, I'm optimistic. At the very least, maybe it'll give some good ideas to the ext4 guys.

Next I'm going to start looking into some stress tests and figuring out if we can safely take it into production (again, in our controlled environment where loss of a filesystem and/or incompatible filesystem format changes aren't the end of the world).

Footnotes

¹ Yes, we have all the required licenses. Why does everyone assume that just because we're running wine, we must be slackers trying to dodge license agreements? We're using it because hosting on Linux is vastly more memory-efficient than Windows. Also I just like Linux better.

² Nowadays it grinds the disk less than it used to, because in the newer kernels, Slicehost has switched to using ext3's data=writeback instead of data=ordered. The tech support guy I talked to a few months ago said he had never heard of this, and no, there was no way to do it. I wonder if they changed their setting globally because of me?

Syndicated 2010-11-08 22:08:59 (Updated 2010-11-12 03:03:32) from apenwarr - Business is Programming

9 Nov 2010 (updated 9 Nov 2010 at 07:09 UTC) »

More on 802.11 wireless

As part of my top secret plans to try to make a space-age new wireless router, I've decided to wade through the official IEEE 802.11 specification.

Now okay, I decided that before I actually found out the thing is 1233 pages long, so I might yet change my mind. And let me tell you, IEEE reading isn't quite as captivating as IETF reading. There are at least a couple dozen pages of definitions, like "STA" as the abbreviation for "station," because there is apparently a worldwide shortage of lowercase letters.

Word to the wise: if you're interested in this spec, you might want to start at section 5, which actually gives a decent architectural overview. I actually read the entire definitions section before I got there, which was confusing and maybe non-optimal, but I do feel like I recognize a lot more of the unnecessary acronyms now.

My major goal when I started reading the spec was to find the answers to these two questions:

Is there any reason I can't join more than one wireless network at once?
If I forward packets from one network to another, will it cause interference and packet loss? And if so, can that be avoided somehow?

I'm only at page 42, and I don't have firm answers on these yet. But I feel like I'm getting there.

Before you ask, the answer to the first question is definitely not "you can join as many networks as you have antennas." I know enough electrical engineering to know why that's nonsense, since I was somehow granted a Bachelor's degree in such things; just enough knowledge to be dangerous. But even if the details are fuzzy, let's try this thought experiment:

Back in the before-times, people used to have these analog powered whatzits called "televisions" which would receive "signals" from the "airwaves" using "antennas." Some of these antennas had cutesy sounding names like "rabbit ears," presumably so that people would be allowed to bring them in the house and ruin the careful job the Local Interior Design Authority had done arranging the furniture.

But if you really got fancy, you could get a big fancy antenna and mount it outside somewhere. Then you could put a splitter on the wire from that antenna, and run its signal to more than one TV at a time. Or to your "VCR," which could record one channel even while you watched a totally different one!!! All with a single antenna!!!

I know, it sounds like science fiction, but bear with me, because I clearly remember it, in an abstract sort of way, from my childhood.

(If you used a splitter, you ended up getting less signal strength to each of your receiving devices. But let's ignore that factor here; that's just an analog artifact, similar to what happens when you copy from one "video tape" to another (another item of science fiction; someday, we'll uninvent DRM and get this sort of stuff back). If the antenna is connected to just one digital signal processor, we should be able to mangle it a million different ways and not worry about analog losses.)

So anyway, as much as technology changes, it still mostly seems to stay the same. 802.11 channels are a lot like TV channels; each one gets its own little band of the frequency spectrum. (I was a little surprised that such a recent technology didn't bother using spread spectrum / frequency hopping stuff, but that's how it is.) Thus, just like with your old TV network, you should be able to use a single antenna and receive as many channels as you want.

Relatedly, it seems that 802.11n gains most of its speed by using multiple channels at once. I haven't gotten to that part of the spec yet; I read it elsewhere. But I notice from my online browsing that there are 802.11n "lite" routers with only one antenna, and 802.11n "real" routers with two or three. I think this is pretty theoretically bogus - one antenna ought to be enough for anyone - but probably practically does make a difference.

Why? Because I have a feeling the chipset manufacturers are still in the past. The problem is, sending/receiving on multiple channels at once is kind of hard to do, even if you're working in a purely digital world. At the very least, you need a much higher clock frequency on your DSP to handle multiple full-rate baseband signals simultaneously. But worse, I don't know how much of this stuff is purely digital; they're probably still using analog modulators/demodulators and whatnot. If so, it's probably hard to modulate/demodulate multiple channels at once without using an analog splitter and multiple analog modulators... which would degrade the signal, just like it did with your old TV antenna.

It sounds to me like a solvable problem, but without having yet looked at the hardware/software that implements this stuff, I'm guessing it hasn't been solved yet. This is some pretty leading-edge signal processing stuff, and cheapskates like you are only willing to pay $50-$75 for it, which makes it extra hard. So it was probably just easier to mount multiple antennas and include multiple DSP cores and modulators - in fact, maybe just throw in the same Broadcom chip more than once on the motherboard - and just run them simultaneously. Not optimal, but easier, which means they got to market faster. Expect single-antenna, full rate 802.11n boxes eventually.

So from the above reasoning - all unconfirmed for now - I conclude that, even still, you ought to be able to send/receive on as many channels as you have antennas. And if there's more than one wireless network (SSID) on a single channel, you should be able to join all those wireless networks at once using only one antenna.

As it happens, already by page 42 of the spec I've read the part where it says you absolutely must not join more than one network (literally, "associate with more than one AP") at a time. Party poopers.

But why? The stated reason for the rule is that otherwise, alas, the poor helpless network infrastructure won't know which AP to route through when it looks for your MAC address and multiple APs respond that they're connected to it. But that actually can't be true, because shortly after, they say that you "must attempt to send a disassociate message" when leaving an AP, while admitting that's kind of impossible to do that reliably, since the reason you're leaving might be that you went out of signal range, and how would you know that in advance? Thus, if you're carrying your laptop around and you move out of range of one AP and into range of another and you don't get to disassociate from the first one, the network must be able to handle it, and therefore by extension, it can handle it if you deliberately join more than one network, since the network won't know the difference.

Apparently the guys down at the IEEE 802.11 working group have never heard of crash-only programming; there never should have been a disassociate command in the first place, just like having a DHCP "release my IP address" command was a stupid idea.

Anyway, question #1 looks promising; it looks like a software hack could let us join multiple networks at once. And systems with multiple antennas could even join multiple networks on multiple channels, perhaps.

For my second question, about forwarding packets from one network to another, things are much more screwy. I suspect that forwarding packets between two networks on the same channel will be a problem unless you're careful (ie. receive packet on A, send it out on B, but someone sends the next packet on A while you're sending on B and they interfere), because the APs on the two networks can't easily coordinate any collision control. On separate non-interfering channels it should be okay, of course. But I'll need to read much more before I can conclude anything here.

Interestingly, the standard has accrued a whole bunch of QoS stuff, supposedly designed for real-time audio and video. I doubt that will go anywhere, because overprovisioning is much simpler, especially on a LAN. But the otherwise-probably-pointless QoS stuff includes some interesting timeslot-oriented transmit algorithms (don't expect the 802.11 guys to ever say "token ring") that might be fudgeable for this kind of forwarding. We could just reserve alternate timeslots on alternate networks, thus avoiding overlap. Maybe.

I bet nobody implements the QoS stuff correctly, though, which is why every router I've seen lets you turn it off.

Other interesting things about 802.11

You might know that WEP stands for "wired equivalent privacy." After reading the spec - which mentions in a few places that WEP is deprecated, by the way, which is wise since it was hacked long ago - I think I see where they got that strange name. See, they correctly noted that all IEEE 802 networks (like ethernet) are pretty insecure; if you can plug in, you can see packets that aren't yours. And the world gets along even so; that's why they invented ssh, which is why I invented sshuttle, and so on. You don't need ethernet-layer security to have application-layer security.

However, they didn't want to make it even worse. The theory at the time they were inventing 802.11 must have been this: the security requirement that "they must be able to physically plug in a wire" isn't very strong, but it's strong enough; it means someone has to physically access our office. By the time they can do that, they can steal paper files too. So most people are happy with wired-level security. With wireless, it goes one step too far; someone standing outside our locked office door could join our office network. That's not good enough, so we have to improve it.

And they decided to improve it: exactly to the same level (they thought) as a wired network. Which is to say, pretty crappy, but not as crappy.

From what I can see, WEP is simply this: everybody on your network takes the same preshared key to encrypt and decrypt all the packets; thus everybody on the network can see everybody else's packets; thus it's exactly as good as (and no better than) a wire. Knowing the digital key is equivalent to having the physical key to the office door, which would let you plug stuff in.

And actually that would have been fine. Wired-equivalent security really is good enough, mostly, on a private network. (If you're in an internet cafe, well, mere wires wouldn't save you, and neither will WEP or WPA2. Imagine that someone has hacked the router.) Unfortunately WEP ended up having some bugs (aka "guess we should have hired a better security consultant") that made it not as good as wired. Reading between the lines of the spec, I gather that one major flaw in WEP is replay attacks: even if someone doesn't have the key, they can replay old packets, which can trick hosts into doing various things even if you yourself can't read the packet contents. You can't do that on a wired network, and therefore WEP isn't "wired-equivalent privacy" at all.

So anyway, all that was interesting because I hadn't realized that WEP wasn't even supposed to be good. The only problem was it was even worse than it was supposed to be, which put it over the edge. The result was the massive overcorrection that became WPA, which as far as I can tell ends up being overkill and horrendously complex, reminiscent of IPsec.

Admittedly I haven't read all the way ahead to WPA though, and the fact that lots of people have implemented it successfully (and interoperably!) kind of implies that it's a better standard than IPsec. (Still: see my previous post for an example of how either dd-wrt or Apple Airport Express apparently still *doesn't* implement it correctly.)

...

The WEP thing is also a good example of a general trend I'm observing while reading the spec: 802.11 does a lot of stuff that really doesn't belong at the low-level network layer. Now, the original "OSI protocol stack" has long been discredited - despite still being taught in my horrible university courses in 2001 and maybe beyond - but the overall idea of your network stack being a "stack" is still reasonable. The whole debate about network stacks comes down to this: higher layers always end up needing to assume things about lower layers, and those assumptions always end up causing your "stack" to become more of a "mishmash."

Without necessarily realizing it, this happened with the world's most common network stack: ethernet + IP + TCP.

First, people have been assuming that ethernet is "pretty secure" (ie. if you're on a LAN, encryption isn't needed). Second, TCP implicitly assumes that ethernet has very low packet loss - packet loss is assumed to mean Internet congestion, which is not true on a wireless network. And third, most IP setups assume that a given ethernet address will always be on the same physical LAN segment, which is how we should route to a particular IP address.

The 802.11 guys - probably correctly - decided that it's way too late to fix those assumptions; they're embedded in pretty much every network and every application on the Internet. So instead, they hacked up the 802.11 standard to make wireless networks act like ethernet. That means wired-equivalent (and with WPA, better-than-wired-equivalent) encryption to bring back the security; device-level retransmits before TCP ever sees a lost packet; association/disassociation madness to let your MAC address hop around, carrying its IP address with it.

It's kind of sad, really, because it means my network now has two retransmit layers, two encryption layers, and two routing layers. All three of those decrease debuggability, increase complexity (and thus the chance of bugs), increase the minimum code size for any router, and increase the amount of jitter that might be seen by my application for a random packet.

Would the world be a better place if we turned off all this link-layer stuff and just reimagined TCP and other protocols based on the new assumptions? I don't know. I suppose it doesn't matter, since I'm pretty sure we're stuck with it at this point.

...

Oh, there was one bit of good news too: 802.11 looks like it's designed well enough to be used for all sorts of different physical wireless transports. That is, it looks like they can switch frequencies, increase bandwidth, reduce power usage, etc. without major changes to the standard, in the same way that ethernet standards have been recycled (with changes, but surprisingly small ones) up to a gigabit (with and without optical fibre) and beyond.

So all this time developers have spent getting their 802.11 software stacks working properly? It won't be wasted next time we upgrade. 802.11 is going to be around for a long, long time.

Update 2010/11/09: Note that a perfectly legitimate reason to have more than one antenna is to improve signal reception. I don't know if that's what routers are actually doing - I half suspect that the venerable WRT54G, for example, just has them to give the impression of better reception - but it's at least possible. The idea of multiple antennas to allow subtracting out the noise goes all the way back to the old days of TV rabbit ears, which generally had two separate antenna arms. Or ears, I guess. The math is a bit beyond me, but I can believe it works. My point was that you shouldn't, in theory, need multiple antennas to use multiple channels.

Syndicated 2010-11-07 07:29:14 (Updated 2010-11-09 07:09:17) from apenwarr - Business is Programming

574 older entries...