Older blog entries for apenwarr (starting at number 527)

Sad moments in mathematics

I can't seem to find any actual use in the fact that 32 is 2 to the power of 5. You'd think that since 2 and 5 are so much smaller than 32, and powers of 2 are so rare, that you could use this information to compress the representation somehow. But no.

Syndicated 2009-11-12 19:29:42 from apenwarr - Business is Programming

Bittorrent, negative latency, and feedback control theory

Once upon a time, long long ago, I convinced mag to build a router-based automatic traffic shaper for Nitix based on control theory.

The basic idea was simple enough: the Linux kernel supports traffic shaping, which allows you to limit and control the amount of data you send/receive. Limiting the data you receive isn't all that useful, as it turns out, but limiting the send rate can be very useful.

If you transmit data at the maximum rate (say, 50k/sec), you'll end up filling your DSL modem's buffer, and then everything you transmit ends up having a multi-second delay, which results in horrendous latency.

If you transmit data at just slightly less than the maximum rate, say 49.9k/sec, the buffer never fills up at all, and your latency is still the minimum. So it's not using your link that makes things unresponsive; it's overfilling the transmit buffer.
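To put rough numbers on that, here's a toy model in Python. It is only a sketch: it steps one second at a time, the function name is mine, and the 128,000-byte modem buffer used below is a made-up figure, not a measurement of any real modem.

```python
def queue_delay(send_rate, uplink_rate, buffer_bytes, seconds):
    """Delay (in seconds) seen by a newly queued byte after `seconds` of sending.

    Rates are in bytes/sec. The backlog grows by the excess of send_rate over
    uplink_rate each second, and is capped at the buffer size (hypothetical).
    """
    backlog = 0.0
    for _ in range(seconds):
        backlog = min(buffer_bytes, max(0.0, backlog + send_rate - uplink_rate))
    # Everything already in the buffer has to drain before a new byte gets out.
    return backlog / uplink_rate
```

With a 50,000 byte/sec uplink and that assumed buffer, `queue_delay(49_900, 50_000, 128_000, 600)` stays at zero, while `queue_delay(51_000, 50_000, 128_000, 600)` settles at about 2.56 seconds: the buffer size divided by the uplink rate.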

The problem: you don't actually know what your uplink rate is, so picking that 99% rate automatically isn't easy. That's why BitTorrent clients let you limit your uplink speed.

At NITI, we observed that latency creeps up right away when you exceed the maximum rate. So we ought to be able to detect that maximum rate by monitoring the latency and using that as feedback into a bandwidth limiter. Basically, a simple feedback control system.

This almost, but not quite, worked. It would in fact work great most of the time, but eventually it would always go into a crazy state in which it kept reducing the transmit rate without having any luck reducing the latency... so it would reduce the transmit rate further out of desperation, and so on. The results made it basically unusable. Too bad. (We never had enough time to fully debug it... some other priority always got in the way.)

Moreover, it wasn't any use to you if you didn't have Nitix.

Anyway, all this is to say that the Bittorrent people have been thinking about the same problems lately, and have supposedly solved them as part of the uTorrent Transport Protocol (UTP). (There's also an IETF protocol called LEDBAT that seems to be related.)

Their approach is similar to what we were doing, but has a few changes that make it more likely to actually work.

First of all, they assume the "minimum achievable latency" is the lowest latency you've seen in the last 3 minutes. Rather than using averages, they observe that if the transmit buffer is always near-empty, then sooner or later you'll get a packet through without any buffer delay. The delay of that packet is the actual network latency; on top of that, anything extra is buffering delay.
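Here's a minimal sketch of that idea in Python. To be clear, this is not the actual uTP or LEDBAT algorithm: the class name, the 100 ms target delay, and the gain are all assumptions I picked for illustration. The only part taken from the description above is using the lowest delay seen in the last 3 minutes (180 seconds) as the baseline.

```python
from collections import deque


class DelayBasedLimiter:
    """Toy send-rate controller driven by one-way delay samples (hypothetical)."""

    def __init__(self, rate, target_delay=0.1, window=180.0, gain=0.1,
                 min_rate=1.0):
        self.rate = rate                  # current allowed send rate (bytes/sec)
        self.target_delay = target_delay  # tolerated queuing delay, seconds (assumed)
        self.window = window              # how long a minimum stays valid, seconds
        self.gain = gain                  # adjustment strength (assumed)
        self.min_rate = min_rate          # never throttle all the way to zero
        self.samples = deque()            # (timestamp, delay) pairs inside the window

    def base_delay(self):
        # Lowest delay in the window: the best guess at "no buffering at all".
        return min(delay for _, delay in self.samples)

    def on_sample(self, now, delay):
        """Record one delay measurement and return the updated rate."""
        self.samples.append((now, delay))
        while self.samples[0][0] < now - self.window:
            self.samples.popleft()        # forget samples older than the window
        queuing = delay - self.base_delay()   # anything above the baseline is buffering
        error = (self.target_delay - queuing) / self.target_delay
        # Speed up while under the target, back off in proportion when over it.
        self.rate = max(self.min_rate, self.rate * (1.0 + self.gain * error))
        return self.rate
```

Feed it one sample per received timestamp: while the queuing delay is below the target, the rate creeps up; once it goes above, the rate drops in proportion to how far over it is.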

Secondly, because they're coming up with a whole new protocol rather than throttling existing TCP sessions, they can add a timestamp to each packet. Basically, that means they can figure out the one-way latency without sending extra packets. Our system required sending out ping packets, which could only measure the full round-trip time (when really you need to measure each direction independently). They also know when they're transmitting at the maximum allowed rate and when they're mostly idle, so they can keep their statistics straight.

Furthermore, their approach focuses on the core of the problem: don't bother limiting overall upload throughput, just limit the rude part of the throughput. They've correctly noted that, almost always, when transmit buffers cause a problem, it's because of BitTorrent. Other than that, almost nobody uses much upload bandwidth at all. So they've limited their solution to only the BitTorrent protocol. That way they don't have to convince anyone else (router manufacturers, operating system kernels, etc) to support their standard.

Now, at last, BitTorrent can be polite. BitTorrent uploads are almost always the lowest-priority thing you could possibly be doing. So it's okay that it always loses out to the slightly-less-polite TCP. (Apparently TCP Vegas is a more polite version of TCP that would accomplish the same thing... if everybody used it. But it requires kernel support, and most kernels supposedly make you choose Vegas globally for all connections, not just for low-priority ones. Which you will never do, because it'll make your whole computer lower priority than everybody else's computers, and thus your personal Internet performance will suck.)

Negative latency and background transmissions

The ability to send data truly "in the background" without interfering with high-priority foreground communications is important. It allows you to implement what I call "negative latency" - transmission of data before anyone requests it.

Disks are getting bigger and bigger, and many computers spend a lot of time sitting idle on the Internet. During that time, they could be "pre-sending" data you might need later. If sending that data had no cost, then even if 99% of the data turned out to be useless, you'd still have a 1% improvement that is worthwhile. And personally, I think a much better than 1% success rate should be possible.

I'm looking forward to it.

Syndicated 2009-10-25 19:24:53 from apenwarr - Business is Programming

Linux in a Nutshell, 6th Edition...

...has a new chapter about Git, courtesy of me.

Sorry for the late notice. I keep thinking of awesome stuff to write about, but not quite getting around to it because I'm too busy. Somehow it's the opposite of writer's block, but has the same net effect.

As computer books go, Linux in a Nutshell is surprisingly awesome. I've been using Linux pretty heavily since 1994, but I can still flip to a random page in this book and learn something new.

Unless I flip to the Git chapter, of course.

Syndicated 2009-10-25 18:39:27 from apenwarr - Business is Programming

Paul Buchheit on "Hacking"

Buchheit has produced a really good article that, at last, clearly describes the nature of "hacking."

I especially like how he handles the debate about hacking being a good thing ("clever hack") or a bad thing (e.g. illegal break-ins). Some people propose that we use two different words ("hacker" and "cracker"), but those never quite feel right. The essay explains why.

    Once the actual rules are known, it may be possible to perform "miracles" -- things which violate the perceived rules. [...] Although this terminology is occasionally disputed, I think it is essentially correct -- these hackers are discovering the actual rules of the computer systems (e.g. buffer overflows), and using them to circumvent the intended rules of the system (typically access controls).

    -- Paul Buchheit, Applied Philosophy, aka Hacking

Syndicated 2009-10-14 17:37:12 from apenwarr - Business is Programming

Forgetting

    Thinking is based on selection and weeding out; remembering everything is strangely similar to forgetting everything. "Maybe most things that people do shouldn't be remembered," Jellinghaus says. "Maybe forgetting is good."

    -- Wired, The Curse of Xanadu [2]

Computers are all about control. 1950's sci-fi was all about the fear of artificial intelligence; about what would happen if every decision was all about logic. But the strangeness of computing is much deeper than that. In a computer, you can change the rules of the universe, just to see what happens. And those changes will reflect onto the outside world.

One of those rule changes is version control. Humans have an overwhelming tendency to forget things; mostly we remember just one version of a person, the last we saw of them, or at best a series of snapshots. Most things that happened to us, we forget altogether. In the short term, we remember a lot; in the long term, we remember less; in the last 10 seconds, we can often replay it verbatim in our heads.

Computers are different. If we want a computer to remember something, we have to tell it to remember. If we want it to forget something, we have to tell it to forget. Because we find that tedious, we define standard rules for remembering and forgetting. With the exception of a few projects, like Xanadu or the Plan 9 WORM filesystem, the standard rules are: remember the current version. Forget everything from before. Some programs, like wikis and banking systems, don't follow the standard rules. For each of those programs, someone wrote explicit code for what to remember and what to forget.

But the standard rules are on the verge of changing.

Cheap disks are now so unbelievably gigantic that most people can only do one of two things with them: pirate more full-frame video content than they will ever have the time to watch, or simply stop deleting stuff. Many people do both.

But another option is starting to emerge: storing old revisions. The technology is advancing fast, and for some sorts of files, systems like git can store their complete history in less than the space of the latest uncompressed data. People never used to think that was even possible; now it's typical.

For some sorts of files, the compression isn't good enough. For large files, you have to use tricks that haven't been finalized yet. And git cheats a little: it doesn't really store every revision. It only stores the revisions you tell it to. For a programmer, that's easy, but for normal people, it's too hard. If you really saved every revision, you'd use even more space, and you'd never be able to find anything.

Back at NITI, we invented (and then patented) a backup system with a clever expiry algorithm based on the human mind: roughly speaking, it backs up constantly, but keeps more of the recent versions and throws away more of the older ones. So you have one revision every few minutes today and yesterday, but only one for the day before, and only one for last week, one for last month and the month before that, etc. [1]
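A rough sketch of that kind of thinning, in Python. This is not the patented algorithm (which, as the footnote says, was rather more complicated); the tier boundaries and the function name are made up purely to show the shape of the idea. Timestamps are integer seconds.

```python
# (maximum age in seconds, bucket width in seconds); all values are hypothetical.
TIERS = [
    (2 * 86400, 600),              # younger than 2 days: keep one per 10 minutes
    (14 * 86400, 86400),           # younger than 2 weeks: keep one per day
    (90 * 86400, 7 * 86400),       # younger than ~3 months: keep one per week
    (float("inf"), 30 * 86400),    # anything older: keep one per 30 days
]


def snapshots_to_keep(timestamps, now):
    """Return the sorted timestamps that survive thinning; the rest can be expired."""
    keep = {}
    for ts in sorted(timestamps):
        age = now - ts
        for tier, (max_age, width) in enumerate(TIERS):
            if age < max_age:
                bucket = (tier, ts // width)
                break
        # Processing in ascending order means the newest snapshot in each
        # bucket overwrites the older ones and is the one that survives.
        keep[bucket] = ts
    return sorted(keep.values())
```

Run it every so often against the list of existing snapshots and delete whatever it doesn't return. Recent history stays dense, and old history gets sparser the further back you go.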

As it happens, the backup system we invented wasn't as smart as git. It duplicated quite a lot of data, thus wasting lots of disk space, in order to make it easier to forget old versions. Git's "object pack" scheme is much more clever, but git has a problem: it only knows how to add new items to the history. It doesn't know how to forget.

But as with so many things about git, that's not entirely true.

Git forgets things frequently. In fact, even when git is forgetting things, it's cleverer than most programs. Git is the only program I've ever seen that uses on-disk garbage collection. Whenever it generates a temporary object, it just writes it to its object store. Then it creates trees of those objects, and writes the tree indexes to the object store. And then it links those trees into a sequence of commits, and stores them in the object store. And if you created a temporary object that doesn't end up in a commit? Then the object sticks around until the next git gc - garbage collection.

When I wrote my earlier article about version control for huge files, some people commented that this is great, but it's not really useful as a backup system, because you can't afford to keep every single revision. This is true. The ideal backup system features not just remembering, but forgetting.

Git is actually capable of forgetting; there are tools like git subtree, for pulling out parts of the tree, and git filter-branch, for pulling out parts of your history.

Those tools are still too complicated for normal humans to operate. But someday, someone will write a git skiplist that indexes your commits in a way that lets you drop some out from the middle without breaking future merges. It's not that hard.

When git can handle large files, and git learns to forget, then it'll be time to revisit those standard rules of memory. What will we do then?

Footnotes

1 Actually it was rather more complicated than that, but that's the general idea. Apple's Time Machine, which came much later, seems to use almost exactly the same algorithm, so it might be a patent violation. But that's not my problem anymore, it's IBM's, and Apple and IBM surely have a patent cross-license deal by now.

2 By the way, I first read that Xanadu article a few years ago, and it's completely captivating. You should read it. Just watch out: it's long.

Syndicated 2009-10-06 03:32:59 from apenwarr - Business is Programming

Version control of really huge files

So let's say you've got a database with 100k rows of 1k bytes each. That comes to about 100 megs, which is a pretty small database by modern standards.

Now let's say you want to store the dumps of that database in a version control system of some sort. 100 megs is a pretty huge file by the standards of version control software. Even if you've only changed one row, some VCS programs will upload the entire new version to the server, then do the checksumming on the server side. (I'm not sure of the exact case with svn, but I'm sure it will re-upload the whole file if you check it into a newly-created branch or as a new file, even if some other branch already has a similar file.) Alternatively, git will be reasonably efficient on the wire, but only after it slurps up mind-boggling amounts of RAM trying to create a multi-level xdelta of various revisions of the file (and to do that, it needs to load multiple revisions into memory at once). It also needs you to have the complete history of all prior backups on the computer doing the upload, which is kind of silly.

Neither of those alternatives is really very good. What's a better system?

Well, rsync is a system that works pretty well for syncing small changes to giant files. It uses a rolling checksum to figure out which chunks of the giant file need to be transferred, then sends only those chunks. Like magic, this works even if the sender doesn't have the old version of the file.

Unfortunately, rsync isn't really perfect for our purposes either. First of all, it isn't really a version control system. If you want to store multiple revisions of the file, you have to make multiple copies, which is wasteful, or xdelta them, which is tedious (and potentially slow to reassemble, and makes it hard to prune intermediate versions), or check them into git, which will still melt down because your files are too big. Plus rsync really can't handle file renames properly - at all.

Okay, what about another idea: let's split the file into chunks, and check each of those blocks into git separately. Then git's delta compression won't have too much to chew on at a time, and we only have to send modified blocks...

Yes! Now we're getting somewhere. Just one catch: what happens if some bytes get inserted or removed in the middle of a file? Remember, this is a database dump: it's plaintext. If you're splitting the file into equal-sized chunks, every chunk boundary after the changed data will be different, so every chunk will have changed.

This sounds similar to the rsync+gzip problem. rsync really sucks by default on .tar.gz files, because if a single byte changes, every compressed byte after that will be different. To solve this problem, they introduced gzip --rsyncable, which uses a clever algorithm to "resync" the gzip bytestream every so often. And it works! tar.gz files compressed with --rsyncable change only a little if the uncompressed data changes only a little, so rsync goes fast. But how do they do it?

Here's how it works: gzip keeps a rolling checksum of the last, say, 32 bytes of the input file. (I haven't actually looked at what window size gzip uses.) If the last n bits of that checksum are all 0, which happens, on average, every 2^n bytes or so, then toss out the gzip dictionary and restart the compression as if that were the beginning of the file. Using this method, a chunk ends every time we see a conforming 32-byte sequence, no matter what bytes came before it.

So here's my trick: instead of doing this algorithm in gzip, I just do it myself in a standalone program. Then I write each chunk to a file, and create an index file that simply lists the filenames of the required chunks (in order). Naturally, I name each chunk after its SHA1 hash, so we get deduplication for free. (If we create the same chunk twice, it'll have the same name, so it doesn't cost us any space.)
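For the curious, here's a minimal Python sketch of the splitting and naming steps. It is not the code from my implementation: the rolling checksum here is just a plain sum of the last 32 bytes (the simplest thing that rolls), the 10-bit mask is an arbitrary choice, the chunks go into a dict instead of files, and the names are plain SHA1 of the chunk contents rather than whatever git would compute.

```python
import hashlib

WINDOW = 32   # rolling window size in bytes (the "say, 32" above)
BITS = 10     # split when the low 10 checksum bits are zero (assumed value)


def split_chunks(data, window=WINDOW, bits=BITS):
    """Split a bytes object at content-defined boundaries; returns a list of chunks."""
    mask = (1 << bits) - 1
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte                     # byte entering the window
        if i >= window:
            rolling -= data[i - window]     # byte leaving the window
        # A boundary depends only on the last `window` bytes, never on position.
        if i >= window - 1 and (rolling & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])         # trailing partial chunk
    return chunks


def build_index(data):
    """Return (index, store): ordered chunk names plus a name -> chunk mapping."""
    store, index = {}, []
    for chunk in split_chunks(data):
        name = hashlib.sha1(chunk).hexdigest()
        store[name] = chunk                 # identical chunks collapse to one entry
        index.append(name)
    return index, store
```

Because a boundary only depends on the bytes inside the window, inserting a few bytes in the middle of the input changes the chunks touching the insertion point and leaves the chunk names before and after it alone.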

...and to be honest, I got a little lazy when it came to creating the chunks, so I just piped them straight to git hash-object --stdin -w, which stores and compresses the objects and prints out the resulting hash codes.

An extremely preliminary experimental proof-of-concept implementation of this file splitting algorithm is on github. It works! My implementation is horrendously slow, but it will be easy to speed up; I just wrote it as naively as possible while I was waiting for the laundry to finish.

Future Work

For our purposes at EQL Data, it would be extremely cool to have the chunking algorithm split based only on primary key text, not the rest of the row. We'd also name each file based on the first primary key in the file. That way, typical chunks will tend to have the same set of rows in them, and git's normal xdelta stuff (now dealing with a bunch of small files instead of one huge one) would be super-efficient.

It would also be entertaining to add this sort of chunking directly into git, so that it could handle huge files without barfing. That would require some changes to the git object store and maybe the protocol, though, so it's not to be taken lightly.

And while we're dreaming, this technique would also be hugely beneficial to a distributed filesystem that only wants to download some revisions, rather than all of them. git's current delta compression works great if you always want the complete history, but that's not so fantastic if your full history is a terabyte and one commit is 100 GB. A distributed filesystem is going to have to be able to handle sparse histories, and this chunking could help.

Prior Art

I came up with this scheme myself, obviously heavily influenced by git and rsync. Naturally, once I knew the right keywords to search for, it turned out that the same chunking algorithm has already been done: A Low-Bandwidth Network Filesystem. (The filesystem itself looks like a bit of a dead end. But they chunk the files the same way I did and save themselves a lot of bandwidth by doing so.)

Syndicated 2009-10-04 02:54:36 from apenwarr - Business is Programming

Off-by-one Limerick

There once was a monkey from Nantuck
et, who gathered his water by buck
   et. When he tripped and fell
   down, he would say, "Farewell,
clowns!" and find a young woodchuck and chuck
it.

Syndicated 2009-09-13 02:39:50 from apenwarr - Business is Programming

Introducing jfauth: Just Fast Authentication

I have a simple problem. I just want to set up a bunch of Linux machines on my LAN, and I want them to check all their passwords against a central server.

Okay, my problem is a little more complicated than that: each machine has only a subset of users, and I don't want my whole network to drop dead just because the central server goes down for a few minutes.

Oh, and I don't trust all the users on all the machines, so I can't reveal the central /etc/shadow to them, so any simple replication scheme involving rsyncing /etc/shadow is out... even if that would ever be a good idea.

Also, I'm using apache_mod_pam, which does a PAM authorization for every single page (or image file, or whatever). That's fast enough when using pam_unix to read /etc/shadow, but it's sucktastic if you use any kind of remote authentication.

Did I mention I have no idea how to properly configure a secure LDAP server? And would it be okay if I wanted the central server to actually be part of a Windows domain controller network and check all the passwords against Windows?

Yeah. I didn't think so.

The Solution

I've had some variation of the above set of problems for years. In Nitix we solved at least some of them using pam_uniconf and nss_uniconf and a wvauthdaemon and some more PAM modules and an auto-configuring LDAP server. It was kind of cool, actually, but it grew into a bit of a mess, and then NITI (which included me, at the time, oops) dropped the final bomb: we started charging extra for each user license. Ouch.

As it turns out, there's an easier way and it involves basically none of that stuff. The problem is, nobody had actually implemented that easier way before. So I spent the last couple of days doing it. The result is jfauth: Just Fast Authentication.

In a nutshell:

  • There's a jfauthd server that listens on Unix and (optional) SSL sockets.
  • There's a jfauth client program that connects to jfauthd.
  • There's a pam_jfauth module that connects to jfauthd.
  • One jfauthd can connect to another over SSL.
  • jfauthd itself authenticates against any PAM service you want, including pam_unix, pam_ldap, and pam_winbind (which lets you use a domain controller)
  • jfauthd caches successful authentications for variable time (default 60 seconds), so apache_mod_pam goes fast even if your PAM provider is slow.
  • jfauthd can use its cache for an even longer time (default infinite) if the master server goes down temporarily.

Oh, and for fun, there's an optional mode where it can automatically run smbpasswd for you whenever someone successfully authenticates, because nobody ever remembers to set their samba password correctly, least of all me.
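The two caching rules in the list above (a short cache for speed, a much longer one for when the master is unreachable) can be sketched in a few lines of Python. This is an illustration only, not jfauthd's code: the class, the `backend` callable, and the use of ConnectionError to mean "master is down" are all assumptions, and a real implementation would want a salted, slow password hash for its cache keys.

```python
import hashlib
import time


class AuthCache:
    """Caches successful logins; falls back to stale entries if the backend is down."""

    def __init__(self, backend, fresh_ttl=60.0, stale_ttl=float("inf"),
                 clock=time.monotonic):
        self.backend = backend        # callable(user, password) -> bool (hypothetical)
        self.fresh_ttl = fresh_ttl    # seconds a success is trusted without asking
        self.stale_ttl = stale_ttl    # seconds a success is trusted if backend is down
        self.clock = clock
        self.last_ok = {}             # (user, password digest) -> time of last success

    def _key(self, user, password):
        digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
        return (user, digest)

    def check(self, user, password):
        key = self._key(user, password)
        now = self.clock()
        age = now - self.last_ok[key] if key in self.last_ok else None
        if age is not None and age <= self.fresh_ttl:
            return True                       # fresh hit: skip the slow backend
        try:
            ok = self.backend(user, password)
        except ConnectionError:
            # Backend unreachable: accept a previous success if it's recent enough.
            return age is not None and age <= self.stale_ttl
        if ok:
            self.last_ok[key] = now
        else:
            self.last_ok.pop(key, None)       # a definite failure invalidates the entry
        return ok
```

Only successes are cached, so a wrong password always goes to the backend, and a password that has never succeeded is rejected while the backend is down.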

There are packages for Debian (etch and lenny) and Ubuntu. The whole binary package is under 50k. I can't believe I waited so long to do this.

Much of the awesomeness, particularly the SSL awesomeness (thanks, ppatters and others) is due to WvStreams.

You can read the README for more information.

Troublesome bits

The caching isn't persistent across daemon restarts (yet, anyway). We also don't do any validation of the server's SSL certificate, so it's subject to a man-in-the-middle attack. (I hear there are tools for that.) Although I was pretty careful, there may be other problems too, probably security related. Use at your own risk, etc.

And let me know what you think!

Syndicated 2009-09-03 21:41:21 from apenwarr - Business is Programming

Microsoft Access + Git = 14x file size reduction

Okay, the cat's apparently out of the bag. I didn't even know these videos had been posted (possibly because I don't pay enough attention), but for once I'm actually rather pleased with how one of my presentations sounds in a recording. This is me at Montreal NewTech in June, talking about some of the technology behind EQL Data.

The first part is a loose history of Microsoft Access and its file format, including some great facts I made up:

The person you can hear giggling in the background occasionally is pphaneuf.

The second part talks about... yes... what happens when you store the entire history of an Access database in Git. (Hint: good things.)

Sadly, this presentation was made a few short weeks before we released EQL Access OnWeb, which actually literally executes Microsoft Access (actually the free Access Runtime) and displays it in a web browser. That would have made a pretty cool demo! Oh well, maybe next time.

Syndicated 2009-08-19 03:23:23 from apenwarr - Business is Programming

SO_LINGER is not the same as Apache's "lingering close"

Have you ever wondered what SO_LINGER is actually for? What TIME_WAIT does? What's the difference between FIN and RST, anyway? Why did web browsers have to have pipelining disabled for so long? Why did all the original Internet protocols have a "Quit" command, when the client could have just closed the socket and been done with it? [1]

I've been curious about all those questions at different points in the past. Today we ran headlong into all of them at once while testing the HTTP client in EQL Data.

If you already know about SO_LINGER problems, then that probably doesn't surprise you; virtually the only time anybody cares about SO_LINGER is with HTTP. Specifically, with HTTP pipelining. And even more specifically, when an HTTP server decides to disconnect you after a fixed number of requests, even if there are more in the pipeline.

Here's what happens:

  • Client sends request #1
  • Client sends request #2
  • ...
  • Client sends request #100
  • All those requests finally arrive at the server side, thanks to network latency.
  • Server sends response #1
  • ...
  • Server sends response #10
  • Server disconnects, because it only handles 10 queries per connection.
  • Server kernel sends TCP RST because userspace didn't read all the input.
  • Client kernel receives responses 1..10
  • Client reads response #1
  • ...
  • Client reads most of response #7
  • Client kernel receives RST, causing it to discard everything in the socket buffer(!!)
  • Client thinks data from response 7 is cut off, and explodes.

Clearly, this is crap. The badness arises from the last two steps: it's actually part of the TCP specification that the client has to discard the unread input data - even though that input data has safely arrived - just because it received a RST. (If the server had read all of its input before doing close(), then it would have sent FIN instead of RST, and FIN doesn't tell anyone to discard anything. So ironically, the server discarding its input data on purpose has caused the client to discard its input data by accident.)

Perfectly acceptable behaviour, by the way, would be for the client to receive a polite TCP FIN after response #10. Since a) the connection closed early, and b) it closed without error, the client knows that everything is fine; the server just didn't feel like answering any more requests. It also knows exactly where the server stopped, so there's no worrying about requests accidentally being run twice, etc. It can just open a new connection and resend the unanswered requests.

But that's not what happened in our example above. So what do you do about it?

The "lingering close"

The obvious solution for this is what Apache calls a lingering close. As you can guess from the description, the solution is on the server side.

What you do is you change the server so, instead of shutting down its socket right away, it just does a shutdown(sock, SHUT_WR) to notify TCP/IP that it isn't planning to write any more stuff. In turn, this sends a notice to the client side, which (eventually) arrives and appears as an EOF - a clean end of data marker, right after response #10. At that point, the client can close() its socket, knowing that its input buffer is safely empty, thus sending a FIN to the server side.

Meanwhile, the server can read all the data in its input buffer and throw it away; it knows the client isn't expecting any more answers. It just needs to flush all that received stuff to avoid accidentally sending an RST and ruining everything. The server can just read until it receives its own end-of-data marker, which we now know is coming, since the client has called close().

Throw in a timeout here and there to prevent abuse, and you're set.
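In Python, the server side of that dance looks roughly like this. It's a sketch of the technique as described above, not Apache's implementation; the function name and the 30-second default are my own choices, and note that the timeout here applies to each recv() rather than to the whole linger.

```python
import socket


def lingering_close(conn, timeout=30.0, bufsize=4096):
    """Half-close, drain unread input until the peer's EOF, then close."""
    try:
        conn.shutdown(socket.SHUT_WR)   # sends FIN: "no more responses coming"
        conn.settimeout(timeout)        # don't let a stalled client hold us forever
        while conn.recv(bufsize):       # discard leftover requests; b"" means EOF
            pass
    except OSError:
        pass                            # timeout or reset: give up and close anyway
    finally:
        conn.close()                    # input is drained, so no RST is triggered
```

Call it in place of a bare conn.close() whenever the server decides to hang up on a client that might still have requests in flight.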

SO_LINGER

You know what all the above isn't? The same thing as SO_LINGER.

It seems like there are a lot of people who are confused by this. I certainly was; various Apache documentation, including the actual comment above the actual implementation of "lingering close" in Apache, implies that Apache's lingering code was written only because SO_LINGER is broken on various operating systems.

Now, I'm sure it was broken on various operating systems for various reasons. But: even when it works, it doesn't solve this problem. It's actually a totally different thing.

SO_LINGER exists to solve exactly one simple problem, and only one problem: the problem that if you close() a socket after writing some stuff, close() will return right away, even if the remote end hasn't yet received everything you wrote.

This behaviour was supposed to be a feature, I'm sure. After all, the kernel has a write buffer; the remote kernel has a read buffer; it's going to do all that buffering in the background anyway and manage getting all the data from point A to point B. Why should close() arbitrarily block waiting for that data to get sent?

Well, it shouldn't, said somebody, and he made it not block, and that was the way it was. But then someone realized that there's an obscure chance that the remote end will die or disappear before all the data has been sent. In that case, the kernel can deal with it just fine, but userspace will never know about it since it has already closed the socket and moved on.

So what does SO_LINGER do? It changes close() to wait until all the data has been sent. (Or, if your socket is non-blocking, to tell you it can't close yet because not all the data has been sent.)
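For reference, turning it on from Python looks something like the sketch below. The helper name is made up, and the two-int layout of struct linger is an assumption that holds on Linux and most other POSIX systems (Windows uses a different layout).

```python
import socket
import struct


def enable_so_linger(sock, seconds):
    """Make close() wait until unsent data is delivered or `seconds` elapse."""
    # struct linger { int l_onoff; int l_linger; }  (assumed POSIX layout)
    linger = struct.pack("ii", 1, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
```

Notice what's missing: nothing in there reads anything. It only affects what close() does with data you've already written.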

What doesn't SO_LINGER do?

It doesn't read leftover data from your input buffer and throw it away, which is what Apache's lingering close does. Even with SO_LINGER, your server will still send an RST at the wrong time and confuse the client in the example above.

What do the two approaches have in common?

They both involve close() and the verb "linger." However, they linger waiting for totally different things, and they do totally different things while they linger.

What should I do with this fun new information?

If you're lucky, nothing. Apache already seems to linger correctly, whether because they eventually figured out why their linger implementation works and SO_LINGER simply doesn't, or because they were lucky, or because they were lazy. The comment in their code, though, is wrong.

If you're writing an HTTP client, hopefully nothing. Your client isn't supposed to have to do anything here; there's no special reason for you to linger (in either sense) on close, because that's exactly what an HTTP client does anyway: it reads until there's nothing more to read, then it closes the connection.

If you're writing an HTTP server: first of all, try to find an excuse not to. And if you still find yourself writing an HTTP server, make sure you linger on close. (In the Apache sense, not the SO_LINGER sense. The latter is almost certainly useless to you. As an HTTP server, what do you care if the data didn't get sent completely? What would you do about it anyway?)

Okay, Mr. Smartypants, so how did *you* get stuck with this?

Believe it or not, I didn't get stuck with this because I (or rather, Luke) was writing an HTTP server. At least, not initially. We were actually testing WvHttpPool (an HTTP client that's part of WvStreams) for reliability, and thought we (actually Luke :)) would write a trivial HTTP server in perl; one that disconnects deliberately at random times to make sure we recover properly.

What we learned is that our crappy test HTTP server has to linger, and I don't mean SO_LINGER, or it totally doesn't work at all. WvStreams, it turned out, works fine as long as you do this.

Epilogue: Lighttpd lingers incorrectly

The bad news is that the reason we were doing all this testing is that the WvStreams http client would fail in some obscure cases. It turns out this is the fault of lighttpd, not WvStreams.

lighttpd 1.4.19 (the version in Debian Lenny) implements its lingering incorrectly. So does the current latest version, lighttpd 1.4.23.

lighttpd implements Apache-style lingering, as it should. Unfortunately it stops lingering as soon as ioctl(FIONREAD) returns zero, which is wrong; that happens when the local socket buffer is empty, but it doesn't guarantee the remote end has finished sending yet. There might be another packet just waiting to arrive a microsecond later, and when it does, blam: RST.

Unfortunately, once I had debugged that, I found out that they actually forgot to linger at all except in case of actual errors. If it's just disconnecting you because you've made too many requests, it doesn't work, and kaboom.

And once I had debugged that, I found out that it sets the linger timeout to only one second. It should be more like 120 seconds, according to the RFCs, though apparently most OSes use about 30 seconds. Gee.

I guess I'll send in a patch.

Footnote

1 Oh, I promised you an answer about the Quit command. Here it is: I'm pretty sure shutdown() was invented long after close(), so the only way to ensure a safe close was to negotiate the shutdown politely at the application layer. If the server never disconnects until it's been asked to Quit, and a client never sends anything after the Quit request, you never run into any of these, ahem, "lingering problems."

Syndicated 2009-08-14 03:59:53 from apenwarr - Business is Programming
