Apparently this week is "let's find bugs in Titus's software" week. Didn't know it was formally defined... but three different people have poked holes in three different-but-related projects. The holes range from already-fixed-but-not-in-the-build (FRII), important-but-easy-to-fix (Cartwheel), and important-and-bloody-difficult-to-fix (paircomp). I have to say my users are really great: finding two of these bugs required great attention to detail. Thanks, guys!
The trickiest bug to fix involves finding transitive connections between three two-way comparisons (find all paths A-->B-->C such that for each path A-->B and B-->C and A-->C). I came up with a clever solution that was easy to understand and easy to implement in simple code; unfortunately, it falls apart in the face of reverse complementing. (As you may know, DNA is readable in two directions: AATTGGCC is equivalent to its reverse complement, GGCCAATT (complement: A <--> T, G <--> C).) This problem is compounded by the asinine data structure that I use to represent the matches. Looks like it's time for a serious refactoring...
All of these bugs remind me of this great quote from an interview with Damian Conway:
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." -- Brian Kernighan, via Damian Conway
I really enjoyed reading this Damian Conway interview on builderau.com. This is a man who has done it all, and has sound advice based on experience. He also gives an excellent reason for using Perl: it's an immensely powerful language that lets you do pretty much whatever you want. (I don't think it's a good idea for inexperienced programmers to use Perl for anything more than short scripts, but -- like Python -- I suspect "short scripts" describes 95% of what is done with Perl ;).
In other news, my OCaml adventures proceed apace. I just finished my very first OCaml program (temp link). dd2.ml implements a simple recursive global-alignment algorithm that finds the optimum gapped alignment between two sequences. Dog slow, but functional (ha ha...)! Now to see if I can add some heuristics into the algorithm to make it speedier.
OCaml is a lot of fun, I must say. At some point I look forward to making use of OCaml's ability to ship cross-platform bytecode around to different machines. It'd be great to be able to add new alignment views and other analyses directly into FamilyRelationsII simply by downloading some new OCaml code! I've also been thinking about how to use OCaml in my tuple space/map-reduce implementation... seems like a good fit!
Last but not least: WSGI. There is now a Web site containing my Quixote and SCGI adapters for the Python WSGI standard. It also turns out I owe Ian Bicking an apology: when I asked why Webware didn't have an adapter, I'd missed Ian's WSGIKit implementation (SVN here, blog here). It's not an adapter so much as a reimplementation effort, as far as I can tell, so I still think there's room for a simple adapter that Just Works (tm). If experiments continue sucking maybe I'll work on that...
ta for now,
--titus