Older blog entries for titus (starting at number 78)

4 Apr 2005 (updated 4 Apr 2005 at 07:33 UTC) »

twill

Just pushed twill 0.6 out the door.

twill is a simple reimplementation of PBP, based on various Python hackage. You may recall from my earlier ravings that PBP is a system for automating simple tests of Web sites. Once I started using PBP a lot, I generated several patches that fixed some basic behavior, but the maintainer was busy with other things and couldn't spend the time to babysit my patches.

Sooooo, long story short, rather than maintaining my own fork of PBP, I decided to re-engineer it from scratch. Ian Bicking, a fellow sufferer, pointed out that IPython might be a good platform on which to build such a package, and I thought that sounded like a good idea.

The result is twill (yes, another random name).

My needs were pretty simple: I want to be able to automate tests of sites on the complexity level of the Quixote demo (form submit, link following, etc., but no Javascript). I also want to be able to automate dumb stuff like discarding all the spam that piles up in some of my more defunct SourceForge project mailing lists. And, last but not least, I want to be able to use it in unit-tests.

twill fulfills all of these needs as of this release. If you're interested you can see two simple Quixote demo test scripts, quixote-demo and quixote-demo-2.

I did end up moving away from IPython a bit. Yes, there's an interactive mode that uses IPython, but it is, in fact, impossible to intercept exceptions in an InteractiveShell. This is necessary when processing scripts: you need to be able to catch exceptions and then exit the script without continuing... I also found IPython large, unwieldy, and full of special cases. I don't want to shoehorn my own in there ;).

So twill is basically a package that offers a thin wrapper around mechanize in the form of a Python library, along with some simple tools to help with program execution (autoquoting in interactive sessions and scripts both). It's very similar to PBP in style, but I think it will be easier to grow.

Let me know if you try it out and have any comments. I'm going to wait for 0.7 or 0.8 to post it to c.l.p.announce, but early early adopters are always welcome ;).

Why emacs and vim?

Mark Williamson asked why emacs and vim?

I use emacs for programming -- one persistent emacs window, many buffers -- and vim for answering e-mail. I like vim because the typeahead is really simple and obvious. I like emacs because I know how to deal with multiple buffers by keystroke, it has all of the commands documented, and I've been using it for over a decade. Decade and a half, now, come to think of it.

So why should I change?? What compelling new feature(s) am I missing out on?

--titus the old fogy ;)

2 Apr 2005 (updated 2 Apr 2005 at 07:53 UTC) »

Conference skullduggery

So this morning, after cleaning out all the spam in my inbox, I discover a Call for Papers for the Fifth Virtual Conference on Genomics and Bioinformatics. It was bounced from the alife-announce mailing list, and then sent on to me (as the moderator) by the original submitter with the request that it be posted.

So, humdedum, I'm scanning down the page thinking "gosh, yet another conference on genomics; look at all of the people I don't recognize on the committee, whee". And then my eyes stop on the name of my advisor, Eric Davidson. And I think to myself, "that's weird, he doesn't usually go in for these conferences." So I step across the hallway and ask him if he knows anything about it. Blank stare. I ask his assistant (who would be dealing with this anyway) and she says she's never heard of it.

So these, these people just dumped his name on this conference announcement without any sort of participation on his part. Well, I guess that's a sign you've made it: your name is placed on conference announcements without your consent, presumably to sucker people into attending the conference.

I'm not really sure what to do. So far I'm just ignoring the e-mail. If the guy asks me to post it again, I'll ask him what the hell my advisor's name is doing on it, I guess.

Enough already

OK, it's April 1st, I get it...

In other news, it's hard to write a diary entry and work on a thesis. I need to do some therapeutic programming...

--titus

p.s. Ankh, I don't believe gay marriage is legal in Las Vegas. See above ;).

On socialism, I have a quote for you: "Any man who isn't a socialist at twenty has no heart. Any man who's still a socialist at thirty has no brain." (Sean Stewart, in Perfect Circle) It actually sums up a surprising number of people I know, although of course there are also many exceptions.

It's also interesting to note that "Red China" isn't particularly communist any more. I know several mainland Chinese here at CIT and they actually tell me it's not a communist state any more; for one surprising example, medicine isn't socialized there. Weird!

30 Mar 2005 »

Python Spotting

A few particularly nice posts on the Python Web-SIG recently, in response to Ian Bicking's request for info from Python hosting providers.

First, Bill Janssen talks about how stable (pure) Python is for long-running processes.

Then Remi Delon writes about the multiple-package problem, and how every Python Web app uses some different mixture of 5-10 packages. It sounds like this wouldn't be so much of a problem for Remi if those packages weren't sometimes difficult to build, but this is a real problem for larger or less specialized hosting companies. After all, they need a fairly standardized install & upgrade that works across different people's applications!

It may be that WSGI adapters and the new 'egg' stuff will help with this... but this is an area where a Python CPAN might help.

Separately, I'm sure everyone saw this interview with Jonathan Rentzsch, but people may not have gotten more than 30 or 40 pages into it. There was some discussion of Python in there: here's a great quote from the middle.

... That said, my eye isn't on Java or C# to put the knife in Objective-C's back -- it's Python. Python addresses all the issues and PyObjC makes it easy to use Python with Cocoa. Indeed, PyObjC can do things Objective-C can't do by itself.

Of course immediately afterward there's the obligatory mention of how SLOW Python is, pushed by the interviewer. I dunno why this keeps on coming up; maybe people are trying to use Python for everything and running into speed problems that I haven't seen. Personally I use Python for whole-genome data crunching and speed is just not a problem.

More C++ book recommendations

A few days ago, I asked for good C++ books and got some good-looking recommendations. Ben Bornstein, a local friend who seems to periodically drop by this diary, just recommended two more: C++ Common Knowledge and Modern C++ Design. I already own the latter but never got into it; obviously it's time to give it a shot ;).

HTTP/SMTP libraries

OK, so clearly I was unclear...

Yesterday I asked about small HTTP/SMTP libraries. dfenwick suggested vmime, which is SMTP-capable. Jacob Smullyan also pointed out the obvious: neon, libwww, and curl. (He also pointed out the rather cool-looking SWILL, a system for driving non-Web apps via the web.)

Of the four solutions suggested above, I like libwww the best, although it would encumber my distribution a bit because of the publicity requirements. ("This application uses...") But I really didn't explain why I needed this in the first place.

My primary bread'n'butter application is FamilyRelationsII, a GUI app written in FLTK that runs on Windows and OS X (well, and Linux too). I distribute it in binary form, primarily to biologists.

Unfortunately it occasionally crashes, usually via an assertion that I've put in there for sanity-checking reasons. Sometimes the assertion is caused by my stupidity, and occasionally it's caused by the genomic data. Either way, assertions happen. And whether or not it's an assertion, I want to know that it happened. And if it is an assertion, I want to get the assertion message and some associated information.

The intended user population is completely computer a-literate, to all intents and purposes. Moreover, the program is usually run from the Finder or the Explorer. This means that error messages usually don't get echoed anywhere, and even if they were saved to a log it'd be an uphill battle to get the user to send it to me.

So I want the program to send it to me.

Hence SMTP/HTTP. I run a bunch of Web servers and mail servers, and I want automatic reporting to one of them. In particular, I want a thin library that can be entirely included in the source code; hence LGPL. I only need simple POST or sending capabilities, nothing fancy. I'd prefer to use a library that's smaller than my application, which is the problem with vmime and possibly the others. (It may be possible to break libwww out into separate smaller pieces, so that I can pick and choose what I want.)

Or, there might be an alternative solution... The only time I can remember seeing this kind of thing is with TalkBack, which lets mozilla.org know when your mozilla dies. Is there anything simple and automatic like this that doesn't require any user input?
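For a sense of how little the "simple POST" half actually needs, here is a rough C++ sketch that only formats the request text for a crash report. The function name, host, and path are made up for illustration, and opening the socket and sending the bytes (the platform-specific part) is deliberately left out, so treat this as a starting point rather than a working reporter.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from any library): format a minimal HTTP/1.0 POST
// that carries a plain-text crash report as its body. Actually sending the
// text over a socket is not shown.
std::string build_report_request(const std::string& host,
                                 const std::string& path,
                                 const std::string& report) {
    std::ostringstream req;
    req << "POST " << path << " HTTP/1.0\r\n"
        << "Host: " << host << "\r\n"
        << "Content-Type: text/plain\r\n"
        << "Content-Length: " << report.size() << "\r\n"
        << "\r\n"          // blank line separates headers from body
        << report;
    return req.str();
}
```

Sending the report as text/plain sidesteps form encoding entirely; the server side just has to log whatever body arrives.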

thanks,
--titus

28 Mar 2005 (updated 28 Mar 2005 at 22:55 UTC) »

Joel on Software

A few interesting articles from Joel on Software. Fun reading.

Standalone, F/OSS SMTP or HTTP client lib for C/C++?

Hey, does anyone know of a nice, simple SMTP (preferably) or HTTP client library for C/C++? It needs to be L/GPL compatible... let me know, thanks!

Google so far found this but it requires some modification. I'm really looking for something like Python's smtplib. I need to include it in a L/GPLed application, so it should be pretty small as well as L/GPL compatible. (The goal is to do error reporting with it.)

thanks,

--titus

25 Mar 2005 (updated 25 Mar 2005 at 18:39 UTC) »

distutils, again

Earlier, I complained about my C --> Python extensions not depending on their libraries.

But hey, it turns out -- upon reading the distutils source code -- that the (very smart) distutils people thought of this:

from distutils.core import Extension

# the c++ extension module (needs to be linked in with libmotility...)
extension_mod = Extension("motility._motilitymodule",
                          ["_motilitymodule.cc"],
                          include_dirs=['../src',],
                          library_dirs=['../src',],
                          libraries=['motility', 'stdc++'],
                          depends=['../src/libmotility.a',])   # <----- this is the line

It turns out that this is, in fact, documented. O well, sometimes it's just easier to read the source code...

I wonder, though: wouldn't it make sense to have libraries automatically added to depends? There might be library search path problems -- I think the library search path is platform-specific -- but you could always just look for them in the specified library_dirs.

Squeeeeeeeeeeeeeeeeeak & Croquet

This tutorial on Croquet is very cool and answers many of my questions about what use Croquet is. To make use of it I have to learn Smalltalk, or I can wait until they have a Python API... Thanks for the reference, nymia!

The Story of Squeak is pretty cool, too.

C++ book recommendations

Marius Gedminas said:

I recommend Scott Meyers's _Effective_C++_ and _More_Effective_C++_ if you want good practical C++ books. There are also some good C++ resources on the Internet: Bjarne Stroustrup's C++ Style and Technique FAQ at http://www.research.att.com/~bs/bs_faq2.html, Marshall Cline's C++ FAQ LITE at http://www.parashift.com/c++-faq-lite/, Herb Sutter's Guru of the Week series at http://www.gotw.ca/gotw/. When I used C++ I also appreciated articles by Andrew Koenig, Barbara Moo, Scott Meyers, Herb Sutter in Dr. Dobb's Journal and in C/C++ Users Journal.

Guru of the Week looks particularly interesting; I like byte-sized chunks of info!

Kent Johnson also recommended "Effective C++". That may be a good starting point.

Thanks, guys. I guess posting lame C++ problems every week or so is a good way to get help ;).

--titus

24 Mar 2005 (updated 24 Mar 2005 at 18:58 UTC) »

Paper: published

Finally! The provisional PDF is a bit ugly, mainly due to the oversized figures. O well.

It was a fun 5 months... (finishing writing --> final acceptance.)

More C++ help needed

OK, here's a silly C++ problem I'm having. (Again, it has to do with exceptions -- is there a book that someone can recommend on "practical C++ programming" or something? 'cause Stroustrup is a decent reference but is shite for learning about the nooks & crannies of the language...)

Here's the code:

int line_no;
if (success) {
   ...
} else {
   std::string exc_str;

   exc_str += "failure at line ";
   exc_str += line_no;
   exc_str += "; aborting.";

   printf("exception is: %s\n", exc_str.c_str());

   throw (my_exception(exc_str));
}

When this code is compiled and placed in a shared library file (by Python distutils on Linux/gcc 3.3.2), I get an odd result: the printf output (and the string passed into my_exception) is NOT what is constructed in the above 'else' code. In fact, if I do anything other than assign a constant string to 'exc_str' I get essentially random output.

I don't think it's a simple scoping issue, because my_exception is making a new copy of exc_str. I think it's related to the shared-librariness aspect of the code. Is there some gcc flag I (or distutils) am missing?

Oh, and one more question: is using '+' the right way to construct the exception report string? It's kinda ugly.

E-mail me... thanks!

Update: The enigmatically named 'tk' pointed out that operator+= interprets the integer as an ASCII code. Whups. He gave me this bit of code instead:


#include <cstdio>
#include <sstream>
#include <string>

void foo() {
   std::ostringstream exc_str;
   exc_str << "failure at line " << 10 << "; aborting.";
   printf("exception is: %s\n", exc_str.str().c_str());
}

This is exactly what I was looking for -- thanks, tk!
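To see the difference side by side, here is a small sketch (the function names are invented for illustration). Appending an int with += selects the single-character overload of std::string, so the number is converted to one char; streaming it formats the decimal digits. In the snippet as posted, line_no is also never assigned a value (possibly just elided), which would make the appended character unpredictable.

```cpp
#include <sstream>
#include <string>

// Appending an int with += picks std::string::operator+=(char), so the
// value is converted to a single character instead of its decimal digits.
std::string append_raw(int line_no) {
    std::string s = "line ";
    s += line_no;            // 65 becomes 'A', not "65"
    return s;
}

// Streaming the int formats it as decimal text, which is what was intended.
std::string append_formatted(int line_no) {
    std::ostringstream out;
    out << "line " << line_no;
    return out.str();
}
```

Calling append_raw(65) yields "line A", while append_formatted(65) yields "line 65".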

--titus

23 Mar 2005 »

Paper: accepted

My paper on FRII: accepted. Hooray!

Open Source Science

Via slashdot, Climatology debate goes open-source. Some climate change challengers made their code available, essentially challenging their opponents to do the same. In this case, the argument is actually about analysis techniques, so it's an especially apt move. I enthusiastically applaud these people.

There were many, many, many, many, many moderately interesting comments.

The distinction between showing your code and not showing your code has nothing to do with your funding source. It's all about publication.

Look, if you want to publish your work, then you need to make your data and your analysis techniques completely transparent. Anything less is Not Science. I think this is especially true of software. (Honestly, if your code is too ugly to show other people, it's probably pretty buggy, too.) And I don't care who funded it: I'm not going to favorably review any paper that does some big-ass computation but doesn't make their analysis system available. Closed source makes a multitude of sins possible.

Of course, just because you need to show your work doesn't mean copyright and license restrictions don't apply. I don't think everybody need release under an OSI-certified OSS license; it's enough if I can see and run your code.

Separately, the Wash Post had an interesting article on the meaning of the word "theory", too. Worth reading if you're interested in this stuff. (Via AMK.)

Croquet and Squeeeeeeak

Alan Kay is da bomb. But what is Croquet actually useful for?? The site is vague and general to the point of drooling idiocy! Is this what happens when you start solving the problems of the world? ;)

I'm serious. I'd like to know! It looks cool. Neat. Nifty. What does it do, other than radically realign the feudal nature of interdependence on the Internet?

Software sucks

Released a couple of updates to FamilyRelationsII, Cartwheel, and the whole mess o' software. Man, I've gotta find a way to automatically test that GUI; people keep on finding fairly obvious bugs right after a new release. Sigh.

--titus

18 Mar 2005 (updated 18 Mar 2005 at 19:21 UTC) »

Violating laws of physics: bad

I just don't understand. OK, so it's a moderately cool article about some of the strange stuff that goes on in science. But it's riddled with nonsensical statements and exhibits a flawed understanding of the scientific method.

For example, on dark matter: "If I could have my pick, I would like to learn that Newton's laws must be modified in order to correctly describe gravitational interactions at large distances," she [Vera Rubin] says. "That's more appealing than a universe filled with a new kind of sub-nuclear particle."

Err, first of all, you might want to talk about modifying the gravitational theory of the 20th century rather than the gravitational theory of the 19th century. General relativity, anyone?

update: fzort points out the MOdified Newtonian Dynamics pages, which look pretty reasonable. (IANAP, though.) Still... my point about Occam's razor stands ;).

Second of all, anyone who thinks that modifying the laws of nature is more desirable than discovering a new slot in the current paradigm doesn't understand Occam's Razor.

(These are almost certainly misquotes; I'm sure Vera Rubin is a good scientist who understands what she's doing.)

But the article really takes the cake with its mention of cold fusion and homeopathy. Here's a hint: if your theory requires a change to prevailing paradigms, the burden of proof is on you. This is a point that needs to be emphasized, because it's a simple way to distinguish crackpots from scientists. Crackpots make a few measurements, assert that they've discovered a new paradigm, and then argue about it for the rest of their lives. Scientists make a lot of measurements, assert that there's nothing new but that they still need to explain a few things. Then, at some point, enough different scientists find a hole that there's a general belief in the need for a new theory. Read up on Einstein and the photoelectric effect if you want to see how quantum mechanics went through this process.

Conspiracy theorists point to a few standard tropes as evidence that scientists are part of some gigantic cabal: general acceptance of global warming, general rejection of cold fusion, homeopathy, and creationism. What do all of these things have in common? One commonality -- I've said this before -- is that these areas aren't generating testable hypotheses. In the case of climatology, this is because climate prediction is really effin' hard. In the case of cold fusion, it's because the true believers in cold fusion won't open up their methods to be tested. In the case of creationism, there simply aren't any testable hypotheses. And in the case of homeopathy, negative results simply aren't noticed by the true believers.

I find it almost amusing that people believe scientists are part of some gigantic cabal. The only thing most scientists are united in is the belief in the scientific method. Everything else is up for grabs & hence is the source for unending arguments.

For the proper way to deal with something like cold fusion, check out this article on cavitation and table-top fusion. Yeah, that's right -- the authors actually published their work and are looking for other people to confirm it. It's called "science", folks...

fumingly yours,
--titus

16 Mar 2005 »

Camino: da bomb

Day 3 of using Camino. No crashes, fast as ever, still happy. Only complaint: why do Safari, Firefox, and Camino all have different keys for switching between tabs!?

Garbage Collection: Python vs C++

Chui Tey gives a nice example of how something that works in C++ wouldn't work in Python.

del.icio.us

Spent some time bouncing around del.icio.us last night. (That's almost as hard to type as "laphroaig"!) Sooooo, is del.icio.us here to stay? Didn't seem too active when I was there: a lot of the "most popular" pages were empty, including the one for Python (which is now populated by one entry). Maybe I just don't understand.

"python" vs ...

ncm points out that it can be informative to look at "python sucks" as well as "python" to get an idea of language "popularity". Interesting results.

Murphy's Computer Laws

This came across the IP list recently. My favorite is #10, "The number one cause of computer problems is computer solutions."

Speaking of "laws"...

Two new rules o' thumb. First:

Anyone using "quantum" and "biology" in the same sentence is full of it, unless the sentence is "quantum mechanics has nothing to do with biology".

Corollary: "quantum evolution" is bullshit. (Sorry, Jean ;).

Second:

Patterns in genomes aren't (scientifically) interesting unless you have a specific, testable hypothesis about their meaning.

Yes, we already know that genomes are highly non-random. So what? What does it mean?

Anyway.

--titus

14 Mar 2005 »

Safari --> Firefox --> Saf... no, Camino

Camino latest is blazingly fast, doesn't sit & spin, has no focus issues, and is very pwetty. Highly recommended.

Switching between three browsers is a bitch. Most of my bookmarks are on a private 'links' page, but I didn't realize how much I depended on saved user/pass info. Ever since I settled on my iBook as my portal to, well, everything -- an 80x40 terminal window running screen, Web browser, and X server satisfy roughly 99% of my needs -- I've been letting my browser remember my login info. Now I need to have sites send the passwords to me, and in some cases they randomize 'em. Argh. In one extreme case -- my local library -- I'm going to have to go visit the library to get a new password.

C++ sweetness

Proper use of C++ sure is nice & clean. I'd still prefer the cleanliness of try/finally, but yesterday's global interpreter lock class is useful enough that it should be put somewhere for other people to find. I wonder if I could convince GvR etc. to put it in the (currently very short) writing extensions in C++ docs? I couldn't find a place in the Python cookbook for what is, technically, a C++ recipe...

Thanks to Chris Frey, Peter Hart, and Max Caceres for their help on this!
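The lock class itself isn't reproduced in this entry, so what follows is only a generic sketch of the idiom it relies on: acquire in a constructor, release in a destructor, and let scope exit do the work that try/finally would do elsewhere. The acquire_lock/release_lock functions are stand-ins invented so the sketch compiles on its own; a real extension would call into the Python C API at those two points.

```cpp
#include <stdexcept>

// Stand-ins for the real acquire/release calls (assumptions for this
// sketch); they just track how many times the "lock" is currently held.
static int g_lock_depth = 0;
void acquire_lock() { ++g_lock_depth; }
void release_lock() { --g_lock_depth; }

// Scope guard: acquires in the constructor and releases in the destructor,
// so the release runs on every exit path, including a thrown exception.
class ScopedLock {
public:
    ScopedLock() { acquire_lock(); }
    ~ScopedLock() { release_lock(); }
    ScopedLock(const ScopedLock&) = delete;             // non-copyable:
    ScopedLock& operator=(const ScopedLock&) = delete;  // one release per acquire
};

// Example use: the guard plays the role of a try/finally block.
int guarded_work(bool fail) {
    ScopedLock guard;
    if (fail) throw std::runtime_error("work failed");
    return g_lock_depth;   // lock is held while 'guard' is in scope
}
```

Whether guarded_work returns normally or throws, g_lock_depth is back to zero afterwards, which is the property a finally clause would otherwise provide.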

My C++ code is beginning to resemble my Python code. (It's still uglier, of course ;). By and large I can do lots of stuff in small amounts of code, and any real ugliness can be hidden in short, easily-tested functions in the implementation file.

Anyway, I'm at a 1.0rc1 release for paircomp, now that I've got error reporting working. I've also made khmer 0.2 available. It's a simple, fast k-mer counting program for whole-genome k-mer statistics. I'm always surprised at how fast you can do the simple stuff: khmer can count all 12 bp words in a 5mb genome in less than a second. Now to try it out on human (600 times larger)... ;)
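For a rough idea of why this kind of counting is quick, here is one common way to do it, which is not necessarily how khmer does it: pack each base into 2 bits, keep a rolling k-base word, and use that word as a direct index into a table of 4^k counters (about 16.8 million for k = 12). The function names, and the choice to count the forward strand only and skip windows containing non-ACGT characters, are assumptions made for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Map a base to 2 bits; returns -1 for anything that is not A, C, G, or T.
int base_code(char b) {
    switch (b) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Count every k-mer in seq (forward strand only) in a direct-indexed table
// of 4^k counters. Windows containing a non-ACGT character are skipped.
// Assumes 1 <= k <= 15 and enough memory for 4^k counters.
std::vector<std::uint32_t> count_kmers(const std::string& seq, unsigned k) {
    std::vector<std::uint32_t> counts(std::size_t(1) << (2 * k), 0);
    const std::uint32_t mask = (std::uint32_t(1) << (2 * k)) - 1;
    std::uint32_t word = 0;   // rolling 2-bit encoding of the last k bases
    unsigned valid = 0;       // consecutive valid bases seen so far
    for (char b : seq) {
        int code = base_code(b);
        if (code < 0) { valid = 0; word = 0; continue; }
        word = ((word << 2) | std::uint32_t(code)) & mask;
        if (++valid >= k) ++counts[word];
    }
    return counts;
}
```

Each base costs a shift, a mask, and one array increment, so the whole pass is linear in the genome length with no hashing or string copies.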

--titus
