5 Sep 2005
(updated 6 Sep 2005 at 20:43 UTC)
In the time since I last posted, I quit my job, moved to Santa Clara, California, applied for jobs, and started working at 2wire. (Yes, in that order.) New surroundings, new challenges. I like it.
I've been playing around with Jython. I've discovered it's incredibly useful for:
- making small scripts that use Java APIs. For example, I made a bunch of scripts to do IMAP operations that Mail.app and Thunderbird don't support. It was much more pleasant doing this with JavaMail than with Python's IMAP API (imaplib), and much more pleasant doing it in Python than in Java: passing functions around was much nicer than writing inner classes for all that stuff.
- experimenting on large Java systems: reproducing bugs, doing benchmarks.
It's not obvious from the Jython webpage, but Jython is under active development again. If you haven't seen 2.2a1, download it and play with it. It's buggy, but I'm impressed. Generators, some integration with the Java collections classes, bug fixes, etc.
After running into SSL performance problems in Java at work, I benchmarked a few different SSL proxy implementations. I wrote a crude distributed SSL load tester. Each transaction connects, handshakes, sends 4 KiB, receives 4 KiB, and disconnects. I found that:
- my SSLProxy.java could do 7 transactions/sec. I wrote this in about 15 minutes. It uses two threads per session: one reads from the client and sends to the server; the other does the opposite. This is how you have to do things in Java 1.4 or below, since they only made the SSL engine work with the non-blocking IO API in Java 1.5.
- stunnel could do about 110 transactions/sec. stunnel uses one process per connection. Each process does non-blocking IO to handle both client->server and server->client in one execution context.
- my async_ssl_proxy could do about 240 transactions/sec, I think. (I had the wrong hardware to test it properly: my test client is process-per-connection, so the tester machine(s) need to be significantly faster than the testee.) This is a libevent-based server that uses one process for all connections. Don't start deleting your stunnel installations, though - it's seriously lacking in polish.
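The transaction behind those numbers (connect, handshake, send 4 KiB, receive 4 KiB, disconnect) can be sketched in a few lines of Python. This is not the original load tester: `one_transaction`, `exchange`, and `recv_exactly` are names made up for illustration, and the sketch assumes the server behind the proxy sends back exactly 4 KiB and that the test rig uses self-signed certificates, so verification is switched off.

```python
import socket
import ssl
import time

PAYLOAD_SIZE = 4096  # 4 KiB in each direction


def recv_exactly(sock, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed with %d bytes outstanding" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def exchange(sock, size=PAYLOAD_SIZE):
    """Send `size` bytes, then read `size` bytes back, on a connected socket."""
    sock.sendall(b"x" * size)
    return recv_exactly(sock, size)


def one_transaction(host, port, context=None):
    """Connect, handshake, exchange 4 KiB each way, disconnect.

    Returns the elapsed time in seconds.
    """
    if context is None:
        # Assumption: self-signed test certificates, so no verification.
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    start = time.monotonic()
    with socket.create_connection((host, port)) as raw:
        # wrap_socket performs the SSL handshake before returning.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            exchange(tls)
    return time.monotonic() - start
```

Transactions/sec is then the number of completed calls divided by the wall-clock time, summed over however many client processes are running.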
I'm not sure why the stunnel people chose to use a process for every session. It's not any simpler than my design, since they're already using non-blocking IO. And apparently it's a lot slower.
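To show what "one process for all connections" looks like, here is a minimal sketch that uses Python's selectors module in place of libevent. It relays plain TCP with no SSL, no half-close handling, and no limit on buffered data, so it only illustrates the shape of the design, not async_ssl_proxy itself. `Relay` and `run_once` are hypothetical names.

```python
import selectors
import socket


class Relay:
    """Copies bytes both ways between two sockets from a shared event loop."""

    def __init__(self, sel, a, b):
        self.sel = sel
        self.peer = {a: b, b: a}
        self.pending = {a: b"", b: b""}  # bytes waiting to be written to each socket
        self.closed = False
        for sock in (a, b):
            sock.setblocking(False)
            sel.register(sock, selectors.EVENT_READ, self)

    def handle(self, sock, mask):
        if self.closed:
            return
        if mask & selectors.EVENT_WRITE:
            self._flush(sock)
        if not self.closed and mask & selectors.EVENT_READ:
            self._read(sock)

    def _read(self, sock):
        try:
            data = sock.recv(65536)
        except BlockingIOError:
            return
        except OSError:
            data = b""
        if not data:
            # EOF or error: tear down the whole session (a simplification).
            self.close()
            return
        dst = self.peer[sock]
        self.pending[dst] += data
        self._flush(dst)

    def _flush(self, sock):
        try:
            sent = sock.send(self.pending[sock])
        except BlockingIOError:
            sent = 0
        except OSError:
            self.close()
            return
        self.pending[sock] = self.pending[sock][sent:]
        # Only ask for writability while there is something left to write.
        events = selectors.EVENT_READ
        if self.pending[sock]:
            events |= selectors.EVENT_WRITE
        self.sel.modify(sock, events, self)

    def close(self):
        if self.closed:
            return
        self.closed = True
        for sock in self.peer:
            self.sel.unregister(sock)
            sock.close()


def run_once(sel, timeout=None):
    """Dispatch one batch of ready events to the relays that own them."""
    for key, mask in sel.select(timeout):
        key.data.handle(key.fileobj, mask)
```

Any number of `Relay` objects can share one selector, which is the point: every session is serviced from the same loop, in the same execution context.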
The Java version's performance is too bad to be explained solely by its threading model. Apparently the Java SSL library just sucks. This was the cause of the problems at work.
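For reference, the two-threads-per-session threading model looks roughly like this. It is sketched in Python with plain sockets rather than Java and SSL, and `pump` and `relay_session` are names invented for the example, not taken from SSLProxy.java.

```python
import socket
import threading


def pump(src, dst):
    """Copy bytes from src to dst until src hits EOF or either side errors."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        # Pass the end-of-stream along so the other side sees EOF too.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def relay_session(client, server):
    """Start the two threads that serve one proxied session."""
    threads = [
        threading.Thread(target=pump, args=(client, server), daemon=True),
        threading.Thread(target=pump, args=(server, client), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return threads
```

Each session costs two blocked threads no matter how idle it is, which is the overhead the non-blocking designs avoid.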
In the process of writing async_ssl_proxy, I discovered that I don't understand UNIX sockets as well as I'd thought. The problem is that the standards don't specify much. I found some websites discussing behavior, but they were too vague. "Some systems, under some circumstances..." What systems? What circumstances?
I want to know if these problems are relevant to me - if modern systems have these problems. If someone tells me "your program doesn't work on Domain/OS SR10.0", I'll say "Here's a quarter. Go buy yourself a Linux system." But if it happens on my shiny OS X laptop, I'll work around it.
I also want to actually see these problems in action. If I can't see the weird behavior, how do I know that I've accounted for it properly?
I'm writing experiments to fix this. They're Python unittest scripts designed to make all of the corner cases happen consistently. I'm simulating network failures by manipulating firewall rules during each test.
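A small example of this kind of experiment, as a sketch only: it does not touch firewall rules. Instead it assumes a simpler trick, closing a loopback TCP socket with SO_LINGER set to a zero timeout so that the peer receives a reset, which makes ECONNRESET happen on demand. The class name is made up, and the `struct linger` packing (two ints) is an assumption that holds on Linux and OS X.

```python
import errno
import socket
import struct
import unittest


class ResetTest(unittest.TestCase):
    def setUp(self):
        # Build a connected client/server pair over loopback TCP.
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.addCleanup(listener.close)
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        self.client = socket.create_connection(listener.getsockname())
        self.addCleanup(self.client.close)
        self.server, _ = listener.accept()
        self.addCleanup(self.server.close)

    def test_recv_after_peer_reset(self):
        # l_onoff=1, l_linger=0: close() drops unsent data and sends RST.
        self.server.setsockopt(
            socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
        )
        self.server.close()
        self.client.settimeout(5)
        with self.assertRaises(OSError) as caught:
            self.client.recv(1)
        self.assertEqual(caught.exception.errno, errno.ECONNRESET)
```

Saved in a file, this runs with `python -m unittest`. Failures that only show up when packets are silently dropped still need something like the firewall approach.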
close_tests is still incomplete. I'd like to go from a failure after any socket operation to a real reason like ECONNRESET or ENETDOWN. My program's behavior wouldn't change, but I like to pass the underlying cause on to the user. Some operations like shutdown give bizarre errors like EINVAL on OS X. But I'm making progress.
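One possible way to turn a failed socket call into a named reason, assuming the exception itself or the socket's pending SO_ERROR value carries a usable errno. `describe_failure` is a hypothetical helper for illustration, not part of close_tests.

```python
import errno
import os
import socket


def describe_failure(exc, sock=None):
    """Return e.g. 'ECONNRESET: Connection reset by peer' for an OSError."""
    code = exc.errno
    if not code and sock is not None:
        # SO_ERROR reports (and clears) the error pending on the socket, if any.
        code = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if not code:
        return "unknown: %s" % exc
    name = errno.errorcode.get(code, "errno %d" % code)
    return "%s: %s" % (name, os.strerror(code))
```

The message text after the colon comes from the platform's strerror, so it differs between systems; only the symbolic name is stable.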