Older blog entries for remle (starting at number 11)

18 Jun 2002 (updated 18 Jun 2002 at 17:58 UTC) »
raph: Here's an idea. When someone uses a person tag, write a recent-log into the person's directory with the content being the author's name. Then add a 'recent citations' section to the home page that keys off the authenticated user.

If any Advogato users are bird watchers, I have a bird watching site based on Advogato code. It's at www.migratus.com. There are some user interface issues that I need to work on, but any feedback would be nice.

It's been a while since I've posted. I'm ready to write another version of the smtp server I wrote for work, but I would like to use Apache 2.x as the underlying system. I haven't been able to convince work that this should be an open source project. I'm the only hardcore C programmer here, so having others look at the code would be of tremendous help.

I've been playing with mod_virgule code. StevenRainwater has been very responsive about my needs. I even submitted a patch for a bug, but Steven found that the bug is much deeper and has posted an even bigger patch. I'm playing with this code too so I can use it for a birding site that I've been trying to do for years. It would be cool if anything could be rated (e.g. diary entries, articles, etc.). However, it seems like that is going to take a while to do.

I bought 'The GNU C Library Reference Manual' 2 weeks ago and did a fast read of it. I like how glibc2 added dynamic buffer allocation for things like sprintf (i.e. asprintf). Also of interest were the hash and tree functions as well as argp (a getopt alternative). There are a bunch of useful functions in glibc2 that my Linux system doesn't have man pages for, so the books (it is 2 volumes) are going to be very useful. Oh, one thing that interested me very much was that all the stream functions (e.g. fprintf) do intrinsic locking so that those functions are thread safe. It seems you get that even if you aren't writing a threaded program. There are corresponding _unlocked functions if you don't want locking.
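
The locked versus _unlocked distinction can be sketched roughly as follows. This is only an illustration, assuming a POSIX system; the helper name write_locked_once is made up for the example. It takes the stream lock once with flockfile() and then uses putc_unlocked() for each byte, instead of paying for a lock on every putc() call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write len bytes to fp while holding the stream
 * lock for the whole loop, rather than locking once per character.
 * Returns the number of bytes written. */
size_t write_locked_once(FILE *fp, const char *buf, size_t len)
{
    size_t i;

    assert(fp != NULL && buf != NULL);
    flockfile(fp);                      /* take the stream's lock explicitly */
    for (i = 0; i < len; i++) {
        /* putc_unlocked() skips the per-call locking that putc() does */
        if (putc_unlocked((unsigned char)buf[i], fp) == EOF)
            break;
    }
    funlockfile(fp);                    /* release it for other threads */
    return i;
}
```

Calling write_locked_once(stdout, "hello\n", 6) behaves like six putc() calls, but with a single lock and unlock around them.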

19 Feb 2002 (updated 19 Feb 2002 at 20:45 UTC) »

Seems like I only post when I need help!

I've been running my SMTP server for quite a while now (a month) and I've noticed the following behaviour. Every now and then a client closes the socket during the DATA phase. I catch that and discard the message. However, after re-reading the RFC, I get 2 conflicting ways to handle this:

One part of the RFC states: 4.1.1.5 RESET (RSET)

... There are circumstances, contrary to the intent of this specification, in which an SMTP server may receive an indication that the underlying TCP connection has been closed or reset. To preserve the robustness of the mail system, SMTP servers SHOULD be prepared for this condition and SHOULD treat it as if a QUIT had been received before the connection disappeared.

OK, that says to treat it as though a QUIT happened, which just doesn't sound right to me. If DATA was issued and I'm waiting to see a CRLF.CRLF, and the client closes the connection, I would think that it would be better to assume the entire message hasn't been sent.
However, later I see:

4.1.1.10 QUIT (QUIT)
... If the connection is closed prematurely due to violations of the above or system or network failure, the server MUST cancel any pending transaction, but not undo any previously completed transaction, and generally MUST act as if the command or transaction in progress had received a temporary error (i.e., a 4yz response).

which doesn't quite make sense to me because basically it sounds like this:
DATA -> oops! -> back to MAIL state -> return a 4xx code but wait, the socket is closed!

So does anybody have some words of wisdom?

I'm thinking the correct behaviour is to treat it as follows: when the client closes the socket, the server treats it as a RSET followed by a QUIT.
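
A minimal sketch of that "RSET followed by QUIT" reading, for illustration only. The state names, the session struct and on_disconnect are all hypothetical and not taken from the actual server. The idea is that a disconnect cancels whatever is in flight, leaves completed transactions alone, and sends no reply because the socket is gone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical session states; the names are illustrative only. */
enum smtp_state { ST_GREET, ST_MAIL, ST_RCPT, ST_DATA, ST_CLOSED };

struct session {
    enum smtp_state state;
    size_t pending_bytes;   /* message bytes received but not yet committed */
    int delivered;          /* transactions already completed and kept */
};

/* Called when read() reports EOF or an error, i.e. the peer is gone.
 * Drops the in-flight transaction (the RSET part) and ends the session
 * (the QUIT part). No 4yz reply is attempted: there is no socket left.
 * Returns 1 if a partial message was discarded, 0 otherwise. */
int on_disconnect(struct session *s)
{
    int discarded;

    assert(s != NULL);
    discarded = (s->state == ST_DATA && s->pending_bytes > 0);
    s->pending_bytes = 0;    /* cancel the pending transaction */
    s->state = ST_CLOSED;    /* completed transactions are left untouched */
    return discarded;
}
```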

I'm reachable at jeff @ virtualbuilder dot com

I'm moving on up!

Been a while since I've posted. Basically I've been optimizing the code. I've replaced several write() calls with writev, trying to minimize system calls as much as possible. I'm now starting on better RFC conformance. For example, I must accept postmaster without a domain as a recipient. Also been thinking of how to get management to let me open source the code.
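
As a rough illustration of the write() to writev() change, here is a sketch that sends a reply assembled from several pieces with one system call. The function name send_reply and the reply layout are assumptions made for the example, not the server's real code.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Point one iovec entry at a NUL-terminated string (terminator excluded). */
static void iov_set(struct iovec *v, const char *s)
{
    v->iov_base = (void *)s;   /* writev() only reads from the buffer */
    v->iov_len = strlen(s);
}

/* Hypothetical helper: send "<code> <host> <text>\r\n" with one writev()
 * call rather than one write() per piece. Returns writev()'s result:
 * the number of bytes written, or -1 with errno set. A short count is
 * possible on a socket, and a real server would retry the remainder. */
ssize_t send_reply(int fd, const char *code, const char *host, const char *text)
{
    struct iovec iov[6];

    assert(code != NULL && host != NULL && text != NULL);
    iov_set(&iov[0], code);
    iov_set(&iov[1], " ");
    iov_set(&iov[2], host);
    iov_set(&iov[3], " ");
    iov_set(&iov[4], text);
    iov_set(&iov[5], "\r\n");
    return writev(fd, iov, 6);
}
```

The pieces stay in their own buffers, so nothing has to be copied into a temporary string before sending.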

I signed up to a bunch of Yahoo lists to see how the MTA holds up. So far no memory leaks. I've had a few timeouts, so I now change the process title for each SMTP phase, much like sendmail does. I also had 60k messages sent to it, and it handled them in under an hour (~16/sec) with 50 children. Thing is, most of the children were idle. 4 machines running sendmail fed it. I was expecting the machine to be overloaded, but it didn't even burp. The test machine is an old IBM Pentium II 350MHz with 256MB RAM.

Thanks to all who responded to my request for help. Looks like I need to learn how to read. My smtp server now works with sendmail. Sendmail has this weird loop detection based on the 2nd word of the response to helo/ehlo. I've fixed the code and it works. Now for some more testing.

Ok, I am now testing MTA interaction with my smtpd server. Sendmail barfs. Here is what I posted to comp.mail.sendmail:
Hi,
I've written a custom MTA and I am testing interaction with it from other MTAs.
It seems sendmail (8.11.6 and whatever AOL is using) doesn't like my MTA, but Yahoo and Exim do. Here's an SMTP log:

/usr/lib/sendmail -v jeff.test@somehost.e-dialog.com
Subject: testing sendmail

This is a test from sendmail
.
jeff.test@somehost.e-dialog.com... Connecting to somehost.e-dialog.com. via esmtp...
220 e-dialog smtpd server $Id: smtp.c,v 1.14 2001/12/18 19:56:38 me Exp $
>>> EHLO server1.somedomain.com
250 server1.somedomain.com Hello, how are you?
somehost.e-dialog.com. config error: mail loops back to me (MX problem?)
>>> QUIT
250 Good bye
jeff.test@somehost.e-dialog.com... Local configuration error
/home/jeff/dead.letter... Saved message in /home/jeff/dead.letter
Closing connection to somehost.e-dialog.com.


I don't understand why I'm getting the MX error. That is printed out by sendmail; I'm not generating it in my SMTP server. The only thing I can think of that may be tripping up sendmail is that I'm doing 3 writes, one for the 250, one for the server name, and one for the message "Hello, how are you?" when I send a response to EHLO.
Thanks in advance for any pointers.

Like the message states, any pointers would be great. You can send email to me at jeff+advogato@virtualbuilder.com.

It's been a while since I posted.

The smtpd server is coming along nicely. I've added a simple blocking list capability. It reads a file for bad internet addresses. The server now properly tells its children to shut down, and the children will shut down once they are done with the current SMTP transaction.

I guess I should outline how this server works. It is very minimal and is meant for bulk processing. It's just a simple pre-forking server that understands SMTP. It writes all messages it receives to disk. It doesn't validate users. It accepts all mail it gets unless the source is in its block list. Messages are written to disk and after that it returns OK. Prepended to each message are a couple of lines:

  • Forward-Path:
  • Return-Path:
  • Received:

Forward-Path reflects the value given to the RCPT command.

Anyhow, the server simply creates a new directory every minute and puts the message in a sub-directory of the current 'minute' directory. That sub-directory is named after the domain portion of the RCPT command. The message file itself is named in the form PIDSEQ. After the file is written, it's renamed PIDSEQ.msg. No locking required. This should also keep the number of messages per directory manageable. I'm gambling that the server doesn't need to write more than 50 messages a second. So how does mail get to its final destination? That's up to other programs.
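
The write-then-rename step can be sketched like this. It is an illustration under assumptions, not the server's code: spool_message is a made-up name, the directory is assumed to exist already (creating the per-minute and per-domain directories is left out), and the sequence number is zero-padded to six digits purely so the PID and SEQ parts cannot run together.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write msg to "<dir>/<PID><SEQ>", then rename it to
 * "<dir>/<PID><SEQ>.msg". rename() is atomic within one filesystem, so a
 * reader that only picks up *.msg files never sees a half-written message.
 * Returns 0 on success, -1 on failure. */
int spool_message(const char *dir, unsigned long seq, const char *msg, size_t len)
{
    char tmp[512];
    char final[sizeof tmp + 4];          /* room for the ".msg" suffix */
    FILE *fp;
    int n;

    assert(dir != NULL && msg != NULL);
    /* The %06lu padding is an assumption of this sketch. */
    n = snprintf(tmp, sizeof tmp, "%s/%ld%06lu", dir, (long)getpid(), seq);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;                       /* path too long */
    snprintf(final, sizeof final, "%s.msg", tmp);

    fp = fopen(tmp, "wb");
    if (fp == NULL)
        return -1;
    if (fwrite(msg, 1, len, fp) != len) {
        fclose(fp);
        remove(tmp);                     /* do not leave a partial file */
        return -1;
    }
    if (fclose(fp) != 0) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, final);           /* the file becomes visible as .msg */
}
```

Because each child uses its own PID in the name, two children never write to the same file, which is why no locking is needed.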

jfleck: Personally that post did wonders for me. Why spend time agonizing about things when the code is going to evolve anyway? Some of my co-workers were horrified with it though. The other Unix guy besides me loved it. The others are Windows/ColdFusion weenies. :-)

The SMTP code is complete. I am now working on blocking access from certain hosts. Last week I started to research APR. I'm very interested in the pools, tables and arrays it provides. My code tries to do minimal mallocing by doubling array sizes. Linked lists require a malloc for every item. I still need to work on the server code so that it will dynamically allocate/deallocate children. I also want to support DSOs to allow modules to participate in any part of the SMTP transaction. I'll add that feature later.
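
The array-doubling approach can be sketched as below. The struct and function names are hypothetical, and the starting capacity of 8 is an arbitrary choice for the example. The point is that the number of realloc() calls grows with the logarithm of the item count, where a linked list would need one malloc() per item.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of string pointers. */
struct str_array {
    const char **items;
    size_t len;   /* slots in use */
    size_t cap;   /* slots allocated */
};

/* Append s, doubling the capacity whenever the array is full.
 * Returns 0 on success, -1 if memory could not be obtained. */
int str_array_push(struct str_array *a, const char *s)
{
    assert(a != NULL);
    if (a->len == a->cap) {
        size_t new_cap = a->cap ? a->cap * 2 : 8;
        const char **p = realloc(a->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;        /* the old block is still valid */
        a->items = p;
        a->cap = new_cap;
    }
    a->items[a->len++] = s;
    return 0;
}
```

Starting from an empty array, 20 pushes trigger only three allocations (capacities 8, 16 and 32).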

Well, I have the pre-forking server all set. I haven't made up my mind if I'll make it a true daemon. There's something elegant about just printing to stderr for logging. I'll have to look into qmail's daemontools further. I've also started implementing SMTP in the server. I've done HELO and MAIL. For MAIL I've pretty much followed the RFCs. I notice that sendmail is a little bit more flexible in what it will handle.

I'll have to get back to the server code though, as it doesn't grow or shrink according to demand yet.

