# Older blog entries for slamb (starting at number 46)

7 Nov 2006 (updated 8 Nov 2006 at 02:40 UTC) »
ncm

You were in Boston before, right? Welcome to the Bay Area...

Thank you! In response to your solving one of my long-standing complaints about the recentlog, I'll try to post something interesting to it.

Background: I've been slowly playing with a suite of software for looking at performance. It's one of those problems that I think few people look at, because even though there's real computer science involved, any project relating to the word "test" is sneered off by most people as QA work. (Another project I've been playing with is a monitoring system for sysadmins; that's basically the same way. Sysadmin tools suck.) It's possible there's a little more computer science involved than necessary because lately I've been wanting to do a little less software engineering and a little more computer science...

No release, documentation, or even a name yet, but there's a little bit of code in my Subversion repository here.

Today's problem: I made a graph of transaction duration vs. time of a quick Apache load test.

It's worthless. You have basically no idea how long the average transaction takes because the scale is blown by a few outliers. I tried a couple obvious things to solve this:

• switching to log scale
• setting ymax to be some function of the mean and standard deviation

but wasn't happy with the result. What I really wanted was what network people do for billing: look at the 95th percentile. The obvious way to do that is to sort the data and pick the element 95% of the way through (index 0.95·n). But my program works with huge data sets - this graph was generated from 1.3 million transactions. Everything else uses (nearly) linear time and not much disk. I don't want to sort stuff.
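For reference, the sort-based approach being ruled out here looks roughly like the sketch below. It uses the nearest-rank definition of a percentile; the function name `percentile_by_sorting` is made up for illustration and is not part of the tool described in this entry.

```python
import math

def percentile_by_sorting(values, q):
    """Return the q-quantile (0 < q <= 1) of values by the nearest-rank method."""
    ordered = sorted(values)             # O(n log n) time and O(n) extra memory
    if not ordered:
        raise ValueError("need at least one value")
    rank = math.ceil(q * len(ordered))   # 1-based rank of the wanted element
    return ordered[max(rank, 1) - 1]

# Toy "durations" with one wild outlier: the median ignores it.
durations = [12, 15, 11, 14, 900, 13, 12, 16, 15, 13]
median = percentile_by_sorting(durations, 0.5)
```

The sort is what makes this unattractive for millions of samples: the whole data set has to be held and ordered before a single number comes out.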

Off to the ACM Digital Library:

These are really cool algorithms to estimate quantiles like the median or 95th percentile. The first has no defined error bounds but uses O(1) space and O(1) time per observation. The second returns a value whose rank r' is within [r-εN, r+εN] (with ε chosen in advance). It runs in a single pass using O(1/ε log (εN)) space. Neither requires you to know N ahead of time.

I implemented the first one. A really simple test:

```
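def testShuffledUniformSequence(self):
    # Feed a shuffled 0..99 to the estimator and check the median estimate.
    r = random.Random()
    e = QuantileEstimator(0.5)
    numbers = list(range(100))  # a real list, so shuffle() also works on Python 3
    r.shuffle(numbers)
    for i in numbers:
        e.append(i)
    self.assertTrue(40 < e.get() < 60)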
```

Though there aren't well-defined error bounds, it seems to do quite well on my data set. That range of 40 to 60 in the test case is crazy liberal; I just wanted to catch my blatant implementation errors.
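To give a feel for what a constant-space estimator can look like, here is a minimal sketch. It is not either of the algorithms mentioned above. It is a much cruder stochastic estimator in the spirit of the "frugal streaming" idea, which nudges a single stored value up or down by one unit per observation. The class name `FrugalQuantileEstimator` is hypothetical; only the `append`/`get` interface is borrowed from the test above. It assumes integer-valued data where a step of 1 is a sensible resolution.

```python
import random

class FrugalQuantileEstimator:
    """Track one quantile of an integer stream while storing a single number."""

    def __init__(self, quantile, rng=None):
        if not 0.0 < quantile < 1.0:
            raise ValueError("quantile must be strictly between 0 and 1")
        self.quantile = quantile
        self.estimate = None
        self._rng = rng if rng is not None else random.Random()

    def append(self, value):
        if self.estimate is None:
            self.estimate = value            # start at the first observation
        elif value > self.estimate:
            # Step up with probability q ...
            if self._rng.random() < self.quantile:
                self.estimate += 1
        elif value < self.estimate:
            # ... and down with probability 1 - q, so the two balance
            # when a fraction q of the stream lies below the estimate.
            if self._rng.random() < 1.0 - self.quantile:
                self.estimate -= 1

    def get(self):
        if self.estimate is None:
            raise ValueError("no values seen yet")
        return self.estimate

# Usage: estimate the median of a uniform stream without storing it.
source = random.Random(12345)
median_estimate = FrugalQuantileEstimator(0.5, rng=random.Random(1))
for _ in range(100000):
    median_estimate.append(source.randint(0, 999))
```

Like the first algorithm described above, this has no hard error bound, and it converges slowly when the first observation is far from the true quantile. It only illustrates that a useful estimate does not require keeping the data.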

I fed that into my graph tool to bound the graph at five times the 95th percentile.

Much better.

Now I have a general-purpose statistics class I can use all over the place. Maybe I'll plug it into Axamol SQL Library's SQL statement timing to get the median and 95th percentile.

23 Sep 2006 (updated 23 Sep 2006 at 16:39 UTC) »
redi, lloydwood:

I'd go further than redi. Look at one of my projects, sigsafe. It's not just quiet; it's inactive. It's never been exactly overflowing with users. And wow, apparently it's been over two years since I've made a release. And there are things I should do to it:

• a bugfix release - Marcin Kowalczyk pointed out a few problems, which I've fixed in SVN.
• more documentation - I have way more than a lot of projects, but there are a bunch more areas I'd intended to fill out as well.
• new platforms - Darwin/x86 and Darwin/x86_64 are suddenly quite popular. I've got some half-finished code for them in my working copy, but sadly no machine to try it on.
• maybe use Linux's newer system call mechanism; it's a little faster.
• maybe port the race condition checker to other platforms.
• maybe move all of the documentation to a trac-based wiki so it's not dependent on my updating it
• maybe set up a buildbot for continuous integration against all those platforms.

In its current state, is it dead? And since it's at the 89.44th percentile, are all the projects below it dead, too? Maybe. But the code I've released should still work as well on the very newest versions of all those operating systems as it did on the versions I wrote it for. And the documentation is perhaps more useful than the software itself - it points out the problem and introduces a few other techniques to solve it. I get a lot of hits on it (few of which go through SourceForge). I'd like to think that it's still providing a service to the world, even though I haven't touched it in so long.

Obligatory: long time since last entry, been busy, went to Africa, blah blah blah.

Django Pronouncement

Titus: that's interesting. Think anything will come of his hope that Django and TurboGears "converge"? TurboGears guy says no; the approaches are too fundamentally different.

I've been using Django, unaware of TurboGears' existence. (Though I was using MochiKit, which is apparently part of it. Still, I missed the memo.) I love Django as a whole, but I hate their ORM. I want to be able to actually use the database properly, which means doing stuff like this:

```
    cursor.execute("update blah blah")
    if cursor.rowcount == expected:
        conn.commit()
        return HttpResponseRedirect('/') # back to here as a GET
    else:
        conn.rollback()
        return HttpResponse(tConflict.render(...)) # or whatever
```

Here I'm doing optimistic locking. I also use transactions to add tightly interrelated data at once; a lot of software which handles your primary keys for you doesn't do this well. I execute non-trivial joins. I tend to hand-code my SQL for this reason. But ORMs are trendy and everyone wants one. SQLAlchemy's website at least tells me that they understand my complaints. Their feature list includes:

• SQLAlchemy doesn't view databases as just collections of tables; it sees them as relational algebra engines.
• Eager-load a graph of objects and their dependencies via joins
• Commit entire graphs of object changes in one step
• You can use the connection pool by itself and deal with raw connections

ORMs which don't have those features are useless to me. Think Django will switch?
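The optimistic-locking check from the snippet earlier in this entry can be tried end to end with the standard-library `sqlite3` module. This is only a sketch under assumptions: the `item` table, its `version` column, and the function name `rename_if_unchanged` are invented for the example and do not come from any real application.

```python
import sqlite3

def rename_if_unchanged(conn, item_id, new_name, expected_version):
    """Apply the update only if the row still has the version we read earlier."""
    cursor = conn.cursor()
    cursor.execute(
        "UPDATE item SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, item_id, expected_version),
    )
    if cursor.rowcount == 1:
        conn.commit()
        return True
    conn.rollback()   # somebody else got there first; nothing was changed
    return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO item VALUES (1, 'old', 1)")
conn.commit()

first = rename_if_unchanged(conn, 1, "new", expected_version=1)    # wins
stale = rename_if_unchanged(conn, 1, "newer", expected_version=1)  # lost the race
```

The second call matches zero rows because the first one bumped `version`, which is exactly the conflict case a web view would report back to the user.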

5 Sep 2005 (updated 6 Sep 2005 at 20:43 UTC) »
Life

In the time since I last posted, I quit my job, moved to Santa Clara, California, applied for jobs, and started working at 2wire. (Yes, in that order.) New surroundings, new challenges. I like it.

Jython

I've been playing around with Jython. I've discovered it's incredibly useful for:

• making small scripts that use Java APIs. For example, I made a bunch of scripts to do IMAP operations that Mail.app and Thunderbird don't support. It was much more pleasant doing this with JavaMail than with Python's IMAP API, and much more pleasant doing this with Python than with Java. It was much nicer to pass around functions than it would have been to write inner classes for all that stuff.
• experimenting on large Java systems. Reproducing bugs, doing benchmarks.

It's not obvious from the Jython webpage, but Jython is under active development again. If you haven't seen 2.2a1, download it and play with it. It's buggy, but I'm impressed. Generators, some integration with the Java collections classes, bug fixes, etc.

SSL proxy

After SSL performance problems at work in Java, I benchmarked a few different SSL proxy implementations. I wrote a crude distributed SSL load tester. It connects, handshakes, sends 4KiB, receives 4KiB, and disconnects. I found that:

• my SSLProxy.java could do 7 transactions/sec. I wrote this in about 15 minutes. It uses two threads per session: one reads from the client and sends to the server; the other does the opposite. This is how you have to do things in Java 1.4 or below, since they only made the SSL engine work with the non-blocking IO API in Java 1.5.
• stunnel could do about 110 transactions/sec. stunnel uses one process per connection. Each process does non-blocking IO to handle both client->server and server->client in one execution context.
• my async_ssl_proxy can do 240 transactions/sec, I think. (I had the wrong hardware to test it properly. I need the tester machine(s) to be significantly faster than the testee, since my test client is process-per-connection.) This is a libevent-based server that uses one process for all connections. Don't start deleting your stunnel installations, though - it's seriously lacking in polish.

I'm not sure why the stunnel people chose to use a process for every session. It's not any simpler than my design, since they're already using non-blocking IO. And apparently it's a lot slower.

The Java version's performance is too bad to be explained solely by its threading model. Apparently the Java SSL library just sucks. This was the cause of the problems at work.
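The one-process-for-all-connections design can be sketched in a few lines of Python's `asyncio`. This is not async_ssl_proxy: it leaves out SSL entirely, uses `asyncio` rather than libevent, and relays to a stand-in echo backend created just for the demo. The names `pump`, `relay`, and `demo` are made up for illustration.

```python
import asyncio

async def pump(reader, writer):
    """Copy bytes one way until EOF, then close the destination."""
    try:
        while True:
            chunk = await reader.read(4096)
            if not chunk:
                break
            writer.write(chunk)
            await writer.drain()
    except ConnectionError:
        pass  # peer vanished; this sketch treats that like EOF
    finally:
        writer.close()

async def relay(client_reader, client_writer, backend_port):
    """Serve one client by copying both directions to and from the backend."""
    backend_reader, backend_writer = await asyncio.open_connection(
        "127.0.0.1", backend_port)
    await asyncio.gather(
        pump(client_reader, backend_writer),   # client -> server
        pump(backend_reader, client_writer),   # server -> client
    )

async def demo():
    # Stand-in backend that echoes its input back on the same connection.
    backend = await asyncio.start_server(pump, "127.0.0.1", 0)
    backend_port = backend.sockets[0].getsockname()[1]

    async def on_client(reader, writer):
        await relay(reader, writer, backend_port)

    proxy = await asyncio.start_server(on_client, "127.0.0.1", 0)
    proxy_port = proxy.sockets[0].getsockname()[1]

    # One transaction shaped like the load test: send 4 KiB, receive 4 KiB.
    reader, writer = await asyncio.open_connection("127.0.0.1", proxy_port)
    payload = b"x" * 4096
    writer.write(payload)
    await writer.drain()
    reply = await reader.readexactly(len(payload))
    writer.close()
    await writer.wait_closed()

    proxy.close()
    backend.close()
    return reply

reply = asyncio.run(demo())
```

Both directions of every session are coroutines on the same event loop, so no thread or process is created per connection.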

socket tests

In the process of writing async_ssl_proxy, I discovered that I don't understand UNIX sockets as well as I'd thought. The problem is that the standards don't specify much. I found some websites discussing behavior, but they were too vague. "Some systems, under some circumstances..." What systems? What circumstances?

I want to know if these problems are relevant to me - if modern systems have these problems. If someone tells me "your program doesn't work on Domain/OS SR10.0", I'll say "Here's a quarter. Go buy yourself a Linux system." But if it happens on my shiny OS X laptop, I'll work around it.

I also want to actually see these problems in action. If I can't see the weird behavior, how do I know that I've accounted for it properly?

I'm writing experiments to fix this. They're Python unittest scripts designed to make all of the corner cases happen consistently. I'm simulating network failures by manipulating firewall rules during each test.

close_tests is still incomplete. I'd like to go from a failure after any socket operation to a real reason like ECONNRESET or ENETDOWN. My program's behavior wouldn't change, but I like to pass the underlying cause on to the user. Some operations like shutdown give bizarre errors like EINVAL on OS X. But I'm making progress.
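One standard way to get from "something failed" to a named reason is the `SO_ERROR` socket option, which returns (and clears) the pending error on a socket. The sketch below shows that lookup in Python. It is an illustration only, not code from close_tests, and the helper names `errno_name` and `pending_error` are invented here.

```python
import errno
import os
import socket

def errno_name(code):
    """Turn a numeric errno into text like 'ECONNRESET: <system message>'."""
    return "%s: %s" % (errno.errorcode.get(code, "E???"), os.strerror(code))

def pending_error(sock):
    """Describe the socket's pending error, or return None if there is none.

    Reading SO_ERROR also clears it, so ask once per failure.
    """
    code = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    return errno_name(code) if code else None

# A freshly created, healthy socket pair has nothing pending.
left, right = socket.socketpair()
status = pending_error(left)
left.close()
right.close()
```

Which errno a given failure produces still varies by platform, which is the part only experiments on real systems can pin down.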

4 Feb 2005 (updated 5 Feb 2005 at 18:04 UTC) »

In the process of getting ready to apply for jobs, I've been polishing my projects' webpages, bundling releases, and creating some documentation. I just submitted to freshmeat:

• NetGrowler - (OS X only) displays pop-up notifications of network events (joining new wireless networks, IP address changes, etc.) through Growl.
• Axamol SQL Library - executes SQL statements stored in external library files from Java code, with named parameters. Separating SQL and Java code increases readability, eases maintenance, and allows separate testing and documentation.

3 Jan 2005 (updated 16 Oct 2006 at 23:53 UTC) »
Life

...goes well. I graduated from the University of Iowa last month with a B.S. in computer science and minor in physics. I want to stay in Iowa City through the summer, so I'm getting a local job. I'll likely just step up to a professional position at the hospital, since the job listings I've found sound horribly boring.

Microsoft

...sucks. Windows ate my Linux partition! I just found this Knowledge Base article. It says that if you have more than one "primary" partition[*] and select any but the first to install to, Windows XP must also reformat all partitions before it. That's bad enough, but what the article doesn't say is that there are no prompts or warnings about this. It gave me the usual reformatting warning, but everything led me to believe it was talking about its little area. When I booted up the Fedora rescue disk to put grub back (as Windows has always silently overwritten the MBR), I was shocked to discover that my Linux partition's type had changed to 0x07 (HPFS/NTFS) and e2fsck found nothing there. No valid magic number at the superblock or any of the backups. It wasn't just unbootable, it was gone.

Just to add insult to injury, it didn't install Windows XP properly, either. It saw the existing Windows 2000 partition on the primary slave (older) hard disk and used that as the "boot" volume (the one containing the NTLDR), again without asking me. In addition to making my drive letters quite weird (C: refers to the second drive and H: refers to the first drive), this means it doesn't boot when I remove the second drive. I need to do some trick in the recovery console to make it work.

If a Linux distribution had this bug, people would raise hell. How does Microsoft get away with this shit?

[*] - The partition table supports up to four partitions, which can be either "primary" or "extended". "Primary" partitions are the simple kind; "extended" ones are useless by themselves but can contain "logical" partitions. I think you should only have one extended partition. Only one partition should be "active" (the one the simple Windows MBR will try to boot) and it must be primary. Though it's actually stored as a boolean in each entry, IIRC, so you could have zero or more active. Who knows what Windows would do then. What a stupid scheme this is.
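The on-disk format behind that footnote is small enough to decode by hand: the first sector holds four 16-byte entries starting at byte 446, each with a status byte (0x80 means active), a type byte, and a start and length in sectors, followed by the 0x55 0xAA signature. The sketch below parses a synthetic sector built for the example, not a dump of the real disk; the function names are hypothetical, and the sector numbers are merely derived from the cylinder figures in the listing in this entry.

```python
import struct

TABLE_OFFSET = 446
ENTRY_FORMAT = "<B3sB3sII"   # status, CHS start, type, CHS end, LBA start, sector count
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)   # 16 bytes
TYPE_NAMES = {0x05: "Extended", 0x07: "HPFS/NTFS", 0x82: "Linux swap", 0x83: "Linux"}

def parse_mbr(sector):
    """Return the non-empty primary/extended entries of a 512-byte MBR sector."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for slot in range(4):
        start = TABLE_OFFSET + slot * ENTRY_SIZE
        status, _chs_first, type_id, _chs_last, lba_start, count = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE])
        if type_id == 0:          # type 0 marks an unused slot
            continue
        entries.append({
            "slot": slot + 1,
            "active": status == 0x80,   # a per-entry flag, so several could be set
            "type": TYPE_NAMES.get(type_id, hex(type_id)),
            "first_sector": lba_start,
            "sectors": count,
        })
    return entries

def make_entry(active, type_id, lba_start, count):
    """Build one table entry; CHS fields are left zeroed for simplicity."""
    return struct.pack(ENTRY_FORMAT, 0x80 if active else 0, b"\0" * 3,
                       type_id, b"\0" * 3, lba_start, count)

# Synthetic sector: an extended partition, then an active NTFS partition.
sector = (bytes(TABLE_OFFSET)
          + make_entry(False, 0x05, 63, 156296322)
          + make_entry(True, 0x07, 156296385, 156280320)
          + bytes(2 * ENTRY_SIZE)
          + b"\x55\xaa")
partitions = parse_mbr(sector)
```

Logical partitions such as hda5 and hda6 do not appear here because they live in a chain of further tables inside the extended partition, which this sketch does not follow.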

The actual partition layout:

```
Disk /dev/hda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1               1        9729    78148161    5  Extended
/dev/hda2   *        9730       19457    78140160    7  HPFS/NTFS
/dev/hda5               1         125     1003999+  82  Linux swap / Solaris
/dev/hda6             126        9729    77144098+  83  Linux
```
18 Dec 2004 (updated 20 Dec 2004 at 02:41 UTC) »
.NET seminar

I attended a .NET seminar for U of I developers today. Jeff Brand was the speaker.

First half: Worthless. Marketing. He acted apologetic, but he still spoke the jargon like a native marketroid. He even brought up the pet store example. While he did admit the line counts were questionable, he really drove in the number of hours spent on tuning the stupid, convoluted EJB version. He didn't mention the good Java implementation. And he made the strange assertion that Microsoft's VM is better because it uses JIT. He hedged a little, saying there are implementation differences on the Java side, but he implied that JVMs do not use JIT. (He might simply have been misleading in saying that Microsoft JIT-compiles everything for near-native speed. Sun is smarter than that; their HotSpot JVM doesn't bother JITing infrequently used code.) I wanted to ask him to name a single major JVM that does not JIT...but I remembered the cardinal rule of seminars: never heckle before they feed you.

Lunch: A disappointment. Microsoft didn't understand how much food we eat. And it was almost all older, full-time developers - I wonder what would have happened if more student/developers showed. We would have eaten his laptop. They ordered more pizza, but it did me little good - had to jet right afterward. Maybe I should have heckled after all.

Second half: Very interesting. He focused on web development with real demos. The controls they have save a lot of work (even the simple ones - a database-driven table that sorts based on clicking the column names and pages, in a themeable way), and the development tools are unbelievable. Eclipse is nice, but after what I saw today, I don't think it's at the same level. I'm sure something similar to the controls they showed exists in the Java world; I will have to find it and design AXP taglibs to interface with it. Maybe JavaServer Faces; I've put off investigating them for too long.

I wouldn't want to use the .NET functionality as-is, though. You'd lose all the advantages if you go a little off their path. For example, the post-back functionality he described as coming in .NET 2.0 is IE-only. No good technical reason why; Google can pull off similar things on any modern browser. This is where open source shines; an OSS framework could do the same partial implementation and I wouldn't care because it'd be easy to hack the source myself.

Update: Jeff Brand just emailed me. They do support the post-back functionality in other browsers. He was thinking of early alpha releases when he said it was IE-only.

SQL query tools

Oracle's horrible SQL*Plus Worksheet made me think about what makes a good SQL query tool. (And by this I mean the kind that just executes SQL statements you give it (like Oracle SQL*Plus), not the kind that comes up with SQL statements based on your selecting tables from a list, drawing joins between them, and selecting columns to return (like Oracle Query Builder).)

I made up a list of things the perfect SQL query tool should do. I'm thinking about writing my own, but first assessing what's already out there. Here are a few of the better tools I've found:

• SQuirreL SQL allows you to have multiple tabs open, displays the results in a decent graphical table, and - very importantly - does not block the GUI while a query is running. In fact, it even has a "cancel" button. (It even works! I'm used to Oracle Query Builder's cancel button, which just exists to toy with you.) So if you realize as soon as you start the query that you forgot a join condition, you can immediately cancel and restart. But I hate many other things about its UI, like that it uses windowed MDI at the top level. It's got a lot of UI/core separation, so I should look again at making an alternate UI for it.
• JFaceDbc, which apparently uses some of SQuirreL's back-end code, is way out there. It's an Eclipse plugin. Like Eclipse, it does some amazing things that I didn't realize I wanted until I saw them. I guess it's not really primarily a query tool, since their screenshots don't actually show the results of running a query. It's more of an IDE for SQL. It has a SQL editor that goes way beyond syntax highlighting. In true Eclipse fashion, it has the hovers that show you table definitions, auto-completion of table names, etc. I need to play with this one some.
• DBTree. Its workbook output format (the right pane of this screenshot) is the closest thing to what I'm envisioning. Looks beautiful. Seems to be closed-source shareware, though.

I'm not sure yet if I'm going to do any work on this. And if I do, I'm not sure if it will be extending one of those tools or starting from scratch. I might at least make some mock-ups of what my perfect tool would look like. I will soon be entering the Real World, so it would be nice to have this in my portfolio to show I can do UI design. Maybe I'll actually write the thing after that, or who knows...maybe someone else will be interested.

Axamol

...is the new name for my xmldb, framework, and mb projects. It's pronounceable, and it has the letters 'xml' in it - all three of these subprojects involve XML.

My choice now is whether I should release xmldb/framework together or separately. I might release them as "Axamol SQL Library" and "Axamol SAX Pipeline". Or together as "Axamol". I wrote up pros and cons. If anyone feels like reading them, I'd like opinions.

Axamol SQL Library

Recently got support for some more dynamic SQL. Jeff started using it for his SQL Logger and ran into a problem. He had some conditionally-included where clauses in his Java-generated SQL. He tried switching it to something like this:

```
<s:query name="foo">
  <s:param name="mindate" type="date"/>
  <s:param name="maxdate" type="date"/>
  <s:sql databases="pgsql">
    select    *
    from      table
    where     (<s:param name="mindate"/> is null or date >= <s:param name="mindate"/>)
      and     (<s:param name="maxdate"/> is null or date &lt;= <s:param name="maxdate"/>)
  </s:sql>
</s:query>
```

...and found that it took 4 seconds where it used to take 15 ms. PostgreSQL was coming up with a query plan designed to work in either case, so it wasn't using his indexes. (Makes me wonder if our Oracle Reports at work should be using lexical bind variables for this same performance reason.)

So I introduced another form of dynamic query:

```
<s:query name="foo">
  <s:param name="mindate" type="date"/>
  <s:param name="maxdate" type="date"/>
  <s:sql databases="pgsql">
    select    *
    from      table
    where     true
    <s:ifNotNull param="mindate">
      and     date >= <s:param name="mindate"/>
    </s:ifNotNull>
    <s:ifNotNull param="maxdate">
      and     date &lt;= <s:param name="maxdate"/>
    </s:ifNotNull>
  </s:sql>
</s:query>
```

Which just conditionally includes a SQL fragment based on whether a parameter is null. Simple but effective.
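The same idea, stripped of the XML, is just "append a clause and its bind value only when the parameter is present". The sketch below shows it in Python against an in-memory SQLite database. It illustrates the technique and is not how the library implements `<s:ifNotNull>`; the `events` table, `day` column, and `build_query` function are invented for the example.

```python
import sqlite3

def build_query(mindate=None, maxdate=None):
    """Return (sql, params) containing a date clause only for each bound given."""
    parts = ["SELECT day FROM events WHERE 1 = 1"]
    params = []
    if mindate is not None:
        parts.append("AND day >= ?")
        params.append(mindate)
    if maxdate is not None:
        parts.append("AND day <= ?")
        params.append(maxdate)
    parts.append("ORDER BY day")
    return " ".join(parts), params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT)")
conn.executemany("INSERT INTO events VALUES (?)",
                 [("2004-11-01",), ("2004-12-01",), ("2005-01-01",)])

sql, params = build_query(mindate="2004-12-01")
rows = [row[0] for row in conn.execute(sql, params)]
```

Because the statement text itself changes, the database plans each variant separately and can pick an index for the bounds that are actually present. Values still travel as bind parameters, so nothing is spliced into the SQL string.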

I also introduced dynamic order by clauses, also for his code. Nothing too exciting there.

He also had a list of regexps to require, but my existing <s:bindlist> dynamic SQL was general enough to handle that. I think he ended up with <s:bindlist join="" each=" and field ~* ?" param="regexps"/> or similar.

I'm not sure how much more dynamic SQL people will need. I could introduce a way to dynamically insert SQL identifiers, but hopefully that's rare to do in software. I've sometimes written queries like this:

```
declare
    cursor grants is
        select    grantee, granted_role
        from      dba_role_privs
        where     granted_role in ('FOO', 'BAR');
begin
    -- "grant" is a reserved word, so the loop record is named g
    for g in grants loop
        execute immediate 'revoke ' || quote_identifier(g.granted_role)
                       || ' from ' || quote_identifier(g.grantee);
    end loop;
end;
/
show errors
```

...but always as ad-hoc queries. I can't think of a reason for software to do that.

Nevertheless, I'm sure as soon as I release this code someone will ask for some form of dynamic query I haven't anticipated. People use SQL for some quite exotic stuff.

Axamol SAX Pipeline

Work on my JSP-like format for building SAX streams is going well. (Used to be .xfp. Called .axp for now, though that has an unfortunate similarity to .asp when spoken.)

I finally ditched "logicsheets", which was a stupid idea I'd gotten from Apache Cocoon. (Preprocessing .axps with XSLT to provide reusable tags.) Instead, I managed to adapt JSP's taglib idea to SAX streams. It worked out well - I even managed to derive an AxpPageContext from javax.servlet.jsp.PageContext. It has most of the same methods but makes fun of you if you try to get a JspWriter from it. It should make porting JSP taglibs easy. In particular, Struts 1.2 seems to want a PageContext to do anything, and now I have one to give it.

Multi-language support is coming. For now it supports java and java-el (Java + JSP EL expression languages). The second one only required 150 lines of code based on Apache Commons EL. (Inheritance is wonderful.) Now I'm working on PythonAxpWriter, using Jython.

I also cleaned up a lot of code and exception handling, and made some things just more pleasant. For example, the Ant task that compiles .axps now considers them out of date if the compiler itself has changed.
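The out-of-date rule amounts to treating the compiler as one more dependency of every output. A minimal sketch of that check in Python, assuming plain modification-time comparison (the real Ant task may decide differently), with hypothetical file names and a made-up `needs_rebuild` function:

```python
import os
import tempfile

def needs_rebuild(source, output, compiler):
    """True if output is missing or older than either the source or the compiler."""
    if not os.path.exists(output):
        return True
    built = os.path.getmtime(output)
    return any(os.path.getmtime(dep) > built for dep in (source, compiler))

# Demo with throwaway files and explicit timestamps.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "page.axp")
    output = os.path.join(tmp, "page.class")
    compiler = os.path.join(tmp, "compiler.jar")
    for path, mtime in ((source, 100), (compiler, 100), (output, 200)):
        open(path, "w").close()
        os.utime(path, (mtime, mtime))
    fresh = needs_rebuild(source, output, compiler)   # output is newer than both
    os.utime(compiler, (300, 300))                    # "upgrade" the compiler
    stale = needs_rebuild(source, output, compiler)   # now the output is too old
```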

AxpCompiler.Handler, the language-independent bulk of the .axp compiler, continues to defy my efforts to clean it up. It's a huge mess, and surely buggy.

14 Nov 2004 (updated 14 Nov 2004 at 06:35 UTC) »
jpick:

The problem you've mentioned, difficulty getting both parallelized builds and proper dependencies, is because you have "all the Makefiles". It's a recursive make problem; if you had a single instance of make able to see the entire dependency graph, a correct parallelized build would be trivial. Have you read Recursive Make Considered Harmful?
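To see why a whole-project graph makes parallel builds easy, consider this small sketch using Python's standard `graphlib` module (Python 3.9 or later). The target names are invented. Each "wave" contains nodes whose dependencies are all finished, so everything inside one wave could run concurrently.

```python
from graphlib import TopologicalSorter

def parallel_waves(graph):
    """Group nodes into waves; every node's dependencies are in earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()                      # also raises CycleError on a cyclic graph
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)               # pretend the whole wave was built
    return waves

# target -> set of things it depends on, for the entire project at once
graph = {
    "app": {"main.o", "libfoo.a"},
    "libfoo.a": {"foo.o"},
    "main.o": {"main.c"},
    "foo.o": {"foo.c"},
}
waves = parallel_waves(graph)
```

A recursive make never holds this whole mapping in one process. Each sub-make sees only its own directory, so ordering across directories has to be encoded by hand.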

The catch is that it's hard to set up a single Makefile for an entire build; it's different from how everyone else does it, so there's no good documentation. Tools like automake only sort of support it. You can include dir.mk files in each directory to have somewhat similar organization, but it gets...weird.

What's more realistic is switching to a make/autoconf/automake alternative like SCons. Everyone who writes an SConstruct (the equivalent of Makefile) uses it to make a single dependency tree of the entire project. It supports this model well by allowing inclusion of SConscript files in subdirectories. These are somewhat like having Makefiles at each level; it's more than just preprocessor-style inclusion. (They have separate scope, though you can import/export variables.) But unlike many Makefiles, it yields a single dependency graph.
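A layout along those lines might look like the fragment below. It is a configuration sketch that only runs under `scons`, not as a standalone Python script. The directory names, file names, and flags are hypothetical.

```python
# --- SConstruct (top of the tree) ---
env = Environment(CCFLAGS=["-O2"])
Export("env")                      # make env importable by the SConscripts

# Each SConscript adds its targets to the one project-wide dependency graph.
SConscript(["lib/SConscript", "app/SConscript"])

# --- lib/SConscript ---
Import("env")                      # separate scope: names must be imported
env.StaticLibrary("foo", ["foo.c"])

# --- app/SConscript ---
Import("env")
# "#lib" means the lib directory relative to the top-level SConstruct.
env.Program("app", ["main.c"], LIBS=["foo"], LIBPATH=["#lib"])
```

With a tree like this, a parallel build is just `scons -j4` from the top, and the library is built before the program links against it.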

Plus, I like SCons better anyway. Clean, uniform syntax (Python!), rather than m4 over make over sh. Built-in dependency scanning for C files. The ability to use a real scripting language to self-generate without introducing another new syntax. Less platform variation, since Python provides a more uniform API than the shell utilities. (No need to use portability m4 macros.)
