What do the words "async I/O" mean to you?
What type of I/O would you think of if someone told you an SMTP server does async I/O?
It's been a while since I've written. For those trying to reduce spam, check out the company Habeas, which uses copyright law to fight spam. Good stuff. Basically, it adds a copyrighted haiku header to your messages if you declare that you are not a spammer. Others can use those headers to filter good mail into one pile, which makes it essentially a whitelist. It's a different approach from trying to identify spam: it simply identifies good mail instead.
This weekend I wrote a device driver for Linux. It's very simple and doesn't actually talk to any hardware. It simply increments a global variable by one every time it is read from. The device returns 2 integers: one is a generation, the other a sequence. The generation is incremented every time the module is loaded, while the sequence is incremented every time the device is read from. There are 2 devices one can read from. One device is meant for C programs that read the data into integers. It never returns 'EOF', so if you want 10 sequences, just read 10 x (2 * sizeof(int)) bytes and then close it. The other device is meant for scripts and returns an ASCII string in the form generation:sequence. It returns 'EOF' after just one sequence is read. Writing to the device sets it to whatever value you write.
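For the binary device, a reader could look roughly like the sketch below. It is only an illustration: the struct name, the function name, and the field order (generation first, matching the generation:sequence order of the ASCII device) are assumptions, and the caller is expected to open whatever device node the module registers.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* One record as described above: two ints per read.
 * The field order (generation first) is an assumption. */
struct seq_record {
    int generation;
    int sequence;
};

/* Read `count` records from an already-open descriptor.
 * Returns the number of complete records read, or -1 on a read error. */
static int read_sequences(int fd, struct seq_record *out, int count)
{
    assert(out != NULL && count >= 0);
    for (int i = 0; i < count; i++) {
        char *p = (char *)&out[i];
        size_t want = sizeof out[i];    /* 2 * sizeof(int) */
        while (want > 0) {
            ssize_t n = read(fd, p, want);
            if (n < 0)
                return -1;              /* read error */
            if (n == 0)
                return i;               /* end of data: stop early */
            p += n;
            want -= (size_t)n;
        }
    }
    return count;
}
```

Asking for 10 sequences is then read_sequences(fd, recs, 10) followed by close(fd).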
With axehind's prodding, I've been looking into openMosix. It is an extremely simple system to set up. He wants to update the userland tools. I didn't want to start on that until I understood the oMFS file system. Now that I do (after many questions to the list), I should be able to start some work on it. The first thing is to get the kernel-headers RPM working properly.
Work
After much research, I've decided not to use Apache 2.0 as the basis for my next SMTP server. I'm convinced that async I/O is the way to go. Thanks, raph, for mentioning that topic in your diary. My SMTP server system will use ReiserFS for a filesystem and perhaps /dev/epoll for event notification. I'm not worried about portability yet. I'm more interested in raw performance, and /dev/epoll seems the best way to go. I've requested that this be open source so others can join in the development. I'll be posting snippets either way.
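To give an idea of what event notification looks like, here is a minimal sketch. Note the assumption: it uses the epoll system calls (epoll_ctl, epoll_wait) that later went into mainline Linux, not the /dev/epoll device interface mentioned above, and the two helper names are made up for illustration.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd for "readable" notifications on an epoll instance.
 * Returns 0 on success, -1 on error (same as epoll_ctl). */
static int watch_readable(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data = { .fd = fd } };
    assert(epfd >= 0 && fd >= 0);
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait up to timeout_ms for one ready descriptor.
 * Returns the ready fd, or -1 on timeout or error. */
static int wait_one_ready(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    if (n <= 0)
        return -1;
    return ev.data.fd;
}
```

A server would create the instance once with epoll_create1(0), register each client socket with watch_readable(), and then loop on the wait call instead of blocking in read() on any single connection.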
If any Advogato users are bird watchers, I have a bird watching site based on Advogato code. It's at www.migratus.com. There are some user interface issues that I need to work on, but any feedback would be nice.
It's been a while since I've posted. I'm ready to write another version of the SMTP server I wrote for work, but I would like to use Apache 2.x as the underlying system. I haven't been able to convince work that this should be an open source project. I'm the only hardcore C programmer here, so having others look at the code would be of tremendous help.
I've been playing with the mod_virgule code. StevenRainwater has been very responsive to my needs. I even submitted a patch for a bug, but Steven found that the bug goes much deeper and has posted an even bigger patch. I'm playing with this code so I can use it for a birding site that I've been trying to build for years. It would be cool if anything could be rated (e.g. diary entries, articles, etc.). However, it seems like that is going to take a while to do.
I bought 'The GNU C Library Reference Manual' 2 weeks ago and did a fast read of it. I like how glibc2 added dynamic buffer allocation for things like sprintf (e.g. asprintf). Also of interest were the hash and tree functions, as well as argp (a getopt alternative). There are a bunch of useful functions in glibc2 that my Linux system doesn't have man pages for, so the books (it is 2 volumes) are going to be very useful. Oh, one thing that interested me very much was that all the stream functions (e.g. fprintf etc.) do implicit locking, so those functions are thread safe. It seems you get that even if you aren't writing a threaded program. There are corresponding _unlocked functions if you don't want the locking.
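Here is a small sketch of both features, assuming glibc. The helper names format_reply and put_line_unlocked are made up for illustration; asprintf, flockfile, funlockfile and putc_unlocked are the real library calls.

```c
#define _GNU_SOURCE             /* asprintf is a GNU extension */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Build "<code> <text>\r\n" in a buffer that asprintf allocates to fit.
 * Returns a malloc'd string the caller must free(), or NULL on failure. */
static char *format_reply(int code, const char *text)
{
    char *line = NULL;
    assert(text != NULL);
    if (asprintf(&line, "%d %s\r\n", code, text) < 0)
        return NULL;            /* on failure the pointer is not usable */
    return line;
}

/* Write a string taking the stream lock once, instead of once per
 * character as plain putc() would. Returns 0 on success, -1 on error. */
static int put_line_unlocked(FILE *fp, const char *s)
{
    int rc = 0;
    assert(fp != NULL && s != NULL);
    flockfile(fp);
    for (; *s != '\0'; s++) {
        if (putc_unlocked(*s, fp) == EOF) {
            rc = -1;
            break;
        }
    }
    funlockfile(fp);
    return rc;
}
```

The point of asprintf is that there is no fixed-size buffer to overflow; the price is remembering to free() the result.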
Seems like I only post when I need help!
I've been running my SMTP server for quite a while now (a month) and I've noticed the following behaviour. Every now and then a client closes the socket during the DATA phase. I catch that and discard the message. However, after re-reading the RFC, I get 2 conflicting ways to handle this:
One part of the RFC states:
4.1.1.5 RESET (RSET)
... There are circumstances, contrary to the intent of this specification, in which an SMTP server may receive an indication that the underlying TCP connection has been closed or reset. To preserve the robustness of the mail system, SMTP servers SHOULD be prepared for this condition and SHOULD treat it as if a QUIT had been received before the connection disappeared.
OK, that says to treat it as though a QUIT happened, which just doesn't sound right to me. If DATA was issued and I'm waiting to see a CRLF.CRLF, and the client closes the connection, I would think it would be better to assume the entire message hasn't been sent.
However, later I see:
4.1.1.10 QUIT (QUIT)
... If the connection is closed prematurely due to violations of the above or system or network failure, the server MUST cancel any pending transaction, but not undo any previously completed transaction, and generally MUST act as if the command or transaction in progress had received a temporary error (i.e., a 4yz response).
which doesn't quite make sense to me, because basically it sounds like this:
DATA -> oops! -> back to MAIL state -> return a 4xx code
but wait, the socket is closed!
So does anybody have some words of wisdom?
I'm thinking the correct behaviour is to treat it as follows:
client closed socket
server treats it as a RSET and QUIT.
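In code, that "RSET and QUIT" reading might look like the sketch below. This is only my interpretation of the two RFC passages, not something the RFC spells out, and the state names, struct fields and function name are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical session states for a small SMTP server. */
enum smtp_state { ST_GREETED, ST_MAIL, ST_RCPT, ST_DATA, ST_CLOSED };

struct smtp_session {
    enum smtp_state state;
    int pending;        /* 1 while a MAIL/RCPT/DATA transaction is open */
    int completed;      /* messages already accepted on this connection */
};

/* Called when read() on the client socket reports EOF or a reset. */
static void on_client_disconnect(struct smtp_session *s)
{
    assert(s != NULL);
    /* Implicit RSET: cancel the pending transaction, so a message cut
     * off before CRLF.CRLF is discarded rather than delivered. */
    s->pending = 0;
    /* Messages completed earlier are left alone (s->completed is not
     * touched), as the QUIT section requires. */
    /* Implicit QUIT: no reply can be sent on a dead socket, so the
     * session simply ends. */
    s->state = ST_CLOSED;
}
```

The 4yz "temporary error" then only matters for internal bookkeeping: the sender never sees it, but it will retry anyway because it never received a 250 for the message.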
I'm reachable at jeff @ virtualbuilder dot com
I'm moving on up!
Been a while since I've posted. Basically I've been optimizing the code. I've replaced several write() calls with writev(), trying to minimize system calls as much as possible. I'm now starting on better RFC conformance. For example, I must accept postmaster without a domain as a recipient. I've also been thinking about how to get management to let me open source the code.
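As an illustration of the write() to writev() change, here is a minimal sketch (not the actual server code; send_reply is a made-up name). It sends a reply line assembled from separate pieces with one system call.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send "<code> <text>\r\n" with a single writev() call instead of
 * separate write() calls for each piece.
 * Returns the number of bytes written, or -1 on error. */
static ssize_t send_reply(int fd, const char *code, const char *text)
{
    struct iovec iov[4];
    assert(code != NULL && text != NULL);
    iov[0].iov_base = (void *)code;    iov[0].iov_len = strlen(code);
    iov[1].iov_base = (void *)" ";     iov[1].iov_len = 1;
    iov[2].iov_base = (void *)text;    iov[2].iov_len = strlen(text);
    iov[3].iov_base = (void *)"\r\n";  iov[3].iov_len = 2;
    return writev(fd, iov, 4);
}
```

Like write(), writev() can return a short count on a socket, so a real server still has to check the return value and resend the remainder.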
I signed up to a bunch of Yahoo lists to see how the MTA holds up. So far no memory leaks. I've had a few timeouts, so I now change the process title for each SMTP phase, much like sendmail does. I also had 60k messages sent to it, and it handled them in under an hour (~16/sec) with 50 children. Thing is, most of the children were idle. 4 machines running sendmail fed it. I was expecting the machine to be overloaded, but it didn't even burp. The test machine is an old IBM Pentium II 350MHz with 256MB of RAM.
Thanks to all who responded to my request for help. Looks like I need to learn how to read. My SMTP server now works with sendmail. Sendmail has this weird loop detection based on the 2nd word of the response to HELO/EHLO. I've fixed the code and it works. Now for some more testing.
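My reading of the failing session is that the server put the client's hostname right after the 250, so sendmail concluded it was talking to itself. The RFC grammar puts the server's own domain in that position. A sketch of a reply formatter along those lines (the function name and greeting text are illustrative, not the real server code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a single-line HELO/EHLO reply with the SERVER's name as the
 * word after the reply code; the client's name only appears later in
 * the free-form greeting text.
 * Returns the reply length, or -1 if it does not fit in buf. */
static int format_ehlo_reply(char *buf, size_t size,
                             const char *server_name,
                             const char *client_name)
{
    int n;
    assert(buf != NULL && server_name != NULL && client_name != NULL);
    n = snprintf(buf, size, "250 %s Hello %s\r\n",
                 server_name, client_name);
    if (n < 0 || (size_t)n >= size)
        return -1;              /* output error or truncated reply */
    return n;
}
```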
OK, I am now testing MTA interaction with my smtpd server. Sendmail barfs. Here is what I posted to comp.mail.sendmail:
Hi,
I've written a custom MTA and I'm testing interaction with it from other MTAs. It seems sendmail (8.11.6 and whatever AOL is using) doesn't like my MTA, but Yahoo and Exim do. Here's an SMTP log:
/usr/lib/sendmail -v jeff.test@somehost.e-dialog.com
Subject: testing sendmail
This is a test from sendmail
.
jeff.test@somehost.e-dialog.com... Connecting to somehost.e-dialog.com. via esmtp...
220 e-dialog smtpd server $Id: smtp.c,v 1.14 2001/12/18 19:56:38 me Exp $
>>> EHLO server1.somedomain.com
250 server1.somedomain.com Hello, how are you?
somehost.e-dialog.com. config error: mail loops back to me (MX problem?)
>>> QUIT
250 Good bye
jeff.test@somehost.e-dialog.com... Local configuration error
/home/jeff/dead.letter... Saved message in /home/jeff/dead.letter
Closing connection to somehost.e-dialog.com.