Just noticed that the ICFP 2002 programming contest now has a website up. It starts on August 30, so it's probably not too soon to make sure your favourite language is installed on the contest machine.
luke@cockatoo:~$ ps aux | grep xterm luke 18794 0.0 0.2 1336 428 pts/6 R 01:19 0:00 grep xterm luke@cockatoo:~$
By switching from xterms to eterms, my dear laptop is now populately solely by Emacs windows - a tremendous personal achievement! It results directly from the breakage of the "mouse button" on my Thinkpad a few months ago, which taught me that a mouse is a lousy substitute for some Emacs and Sawfish hacking :-)
(I still use a mouse for Netscape on my desktop machine at work, w3m not being so universal. One step at a time..)
Actually it's two topics, I think. One is how well TCP works, particularly in some specific situations like "well provisioned" high-speed LANs. The other is about the merits transport-layer framing, in other words having TCP (or a similar protocol) track discrete "frames" rather than just a continuous stream of bytes.
For the framing issue, I think there must be some specific examples where it's useful that haven't been mentioned. Grit: why do you want the correspondence between write()s and TCP segments? What am I missing?
Here's my reply to the "TCP Apologists Considered Annoying" article (anyone's welcome to respond to this):
By pay-as-you-use, I mean that most congestion-control features are only actually used when you encounter congestion. Or more accurately, when you encounter packet loss, severe packet reordering, or large "spikes" in delay, which TCP interprets as signs of congestion.
On a fast and reliable "specially provisioned" network, I'm assuming these things are extremely rare. That being the case, I don't see why TCP congestion control should cause any problems.
It would be more interesting to know what mistakes are showing up in packet traces, and whether they're caused by TCP implementation bugs or by network quirks that the protocol intrinsically doesn't handle well. I'd want to determine this before concluding that TCP's congestion control is expensive, and certainly before turning it off or designing alternatives.
This is just a disagreement over names. I was refering to boxes that both switch ethernet packets and work with the higher level protocols, for instance to use HTTP cookies for session persistence in load balancing. Vendors and industry press call them "layer 7 switches". I did originally put the name in quotes, after all :-)
That hits the nail on the head. The reason that TCP proxies can do so much transparently is that TCP and SOCK_STREAM leave so much freedom to the transport, compared with e.g. a datagram protocol where application level frames must correspond to IP packets. With streams, write() isn't defining a frame, it's just writing the next sequence of bytes in the stream, so there is no frame information to be preserved.
Apparently I am missing the advantages, but then they haven't been stated specifically. I can't tell what's being proposed from the text above, and TCP streams already have most of the listed advantages.
Streams give tremendous flexibility to the transport layer: its only restriction is to ultimately deliver the bytes in order. Any way it wants to chop them up into network-layer packets for better flow-control, retransmission, transfer efficiency, or to fit the receiver's buffer is no problem. It can even split an application-level frame header into pieces if that will be more efficent - it's just stream data like any other. Intermediate proxies are able to re-package the data when they need to, e.g. because their MTU is different on either side or because they want to modify the data. Even the sender and receiver can repackage the data, e.g. with nagle's algorithm or to compress the send/receive queue.
It's also easy to put application frames on top of streams. Writing and reading a 2- or 4- byte in-band length header is cheap and simple, and so are other framing formats like HTTP 1.1 chunked encoding. Taking care of byte ordering is trivial, and often necessary for other things than framing anyway.
In summary, streams support simple and cheap application-level framing with great flexibility to the lower layers, in addition to unframed byte streams. So what significant problem would transport-level framing for a TCP-like protocol be solving, and what specific scheme might it use to do so?
This is a really important point. It's one thing to respect layering, but to completely ignore the existence of lower layers could lead to very inefficient code. It pays to understand what's likely to happen under the hood and give the right "hints", in much the same way it's useful to understand what compilers are going to generate from your programs.
So for instance, if a program is doing a bunch of separate write() calls, this might cause the data to go out in separate IP packets and ethernet frames, when it could have all fit into one. Thoughful use of writev(3) and TCP_CORK could help out for things like this. It sounds like Jeff from Platypus has a lot of experience in this area -- I don't.
The TCP Considered Annoying article (recently posted on raph's diary) is really interesting, but I don't think its title or conclusions are really justified. The specific criticisms are mostly for missing datagram-related feeping creatures rather than for TCP doing a bad job of its intended purpose: providing reliable streams over general internetworks. Particularly, I don't think it was established that "[of TCP's features] some almost always good, some of less certain value, and at least one outright bad", or that the desired extra features are worth adding to TCP.
I realise the author is a smart and informed guy, and that the article wasn't intended as "scientific writing", but I'd like to criticise none the less. This is partly to defend TCP's good name, partly to start a conversation by putting my head on the block, and partly to show off that I've read a book about networks ;-)
First off, there's not much against TCP as a reliable stream protocol for the internet. For people doing stream-based internet programs, TCP's doing a good job for them and they have no need to be "annoyed" with it or to invent their own hopefully-extra-efficient protocols based on UDP. The article doesn't suggest this, but I think it's a fairly common notion (y'know, for when it needs to be really efficient..)
For a specially provisioned network that won't lose packets or become congested, TCP's still going to do a pretty good job. You might like to disable nagle's algorithm to reduce latency, and perhaps tune your buffer sizes, but that's straight forward enough. The congestion-control features are mostly pay-as-you-use, the only thing that should affect you on such a network is slow-start. For initiating new connections to do batch transfers, TCP will give you some overhead - I estimate less than 2ms of total idle time on my local fast ethernet. So setting up new connections has some expense, but if you have a lot of data to transfer and you reuse the connections, this should disappear.
For the complaint that TCP hides packet boundaries, and that lower-layer IP packet shapes could be used instead of in-band length headers, I think a whole can of worms would be opened because this breaks the layering of the protocol stack. Specifically, convenient and useful things that (transparent) proxy servers and "Layer-7" switches can do today could break such a TCP by "reshaping" the packets (which can be useful, and is safe for a stream protocol). What about a content-transforming proxy server, or a generic TCP proxy with different path MTUs on either side? And what if the packets are larger than the MTU - use IP fragmentation? Programs like that would be more awkward to write and prone to subtle errors.
That added complexity doesn't seem worthwhile just to avoid writing some tiny in-band length headers, and would only partially satisfy people who want reliable datagrams, anyway.
Or maybe the real reason to be annoyed with TCP is the "worse is better" sense: it's so good for 90% of the cases that not enough people throw their weight behind things like reliable datagram protocols or SCTP.
Which raises the awkward thought: I don't print/read/edit my programs that way. As an experiment I've been going through the loop a few times with some of my source code, and I think it makes a really solid improvement. I guess this is old news to good programmers, but I'm not usually in the habbit of sitting down and reading my own programs start-to-end.
If I were a real man I'd ask some colleagues to point out my worst code, and print/read/edit that. But I'm not sure I'm that brave right now :-)
All that said, the manual is probably full of typos, and the programs full of bugs - but such is life!
Today I added some basic builtin help functions to Ermacs. As well the as Emacsey C-h k <key sequence> `describe-key', I have a C-h s <key sequence> `find-source', which takes takes you to the source code that's bound to that key. So if you press a key and it doesn't do what you want, you're only about 1sec away from the source code. I think this is a good feature for an under-development editor :-)
Emacs still has the upper hand though, since it takes a bit more than a keystroke to load new definitions into Ermacs, for now..
Welcome back to the "real world", purcell :-)
New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.
Keep up with the latest Advogato features by reading the Advogato status blog.
If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!