Older blog entries for titus (starting at number 73)

24 Mar 2005 (updated 24 Mar 2005 at 18:58 UTC) »
Paper: published

Finally! The provisional PDF is a bit ugly, mainly due to the oversized figures. O well.

It was a fun 5 months... (finishing writing --> final acceptance.)

More C++ help needed

OK, here's a silly C++ problem I'm having. (Again, it has to do with exceptions -- is there a book that someone can recommend on "practical C++ programming" or something? 'cause Stroustrup is a decent reference but is shite for learning about the nooks & crannies of the language...)

Here's the code:

int line_no;
if (success) {
} else {
   std::string exc_str;

   exc_str += "failure at line ";
   exc_str += line_no;
   exc_str += "; aborting.";

   printf("exception is: %s\n", exc_str.c_str());

   throw (my_exception(exc_str));
}

When this code is compiled and placed in a shared library file (by Python distutils on Linux/gcc 3.3.2), I get an odd result: the printf output (and the string passed into my_exception) is NOT what is constructed in the above 'else' code. In fact, if I do anything other than assign a constant string to 'exc_str' I get essentially random output.

I don't think it's a simple scoping issue, because my_exception is making a new copy of exc_str. I think it's related to the shared-librariness aspect of the code. Is there some gcc flag I (or distutils) am missing?

Oh, and one more question: is using '+' the right way to construct the exception report string? It's kinda ugly.

E-mail me... thanks!

Update: The enigmatically named 'tk' pointed out that std::string's operator+= treats the integer as a single character code (it gets converted to a char) rather than formatting it as a number. Whups. He gave me this bit of code instead:

#include <cstdio>
#include <sstream>
#include <string>

void foo() {
   std::ostringstream exc_str;
   exc_str << "failure at line " << 10 << "; aborting.";
   printf("exception is: %s\n", exc_str.str().c_str());
}

This is exactly what I was looking for -- thanks, tk!
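To make tk's point concrete, here is a minimal sketch contrasting the two approaches. The function names broken_message and make_failure_message are made up for illustration; they are not from my actual code.

```cpp
#include <sstream>
#include <string>

// Reproduces the bug: std::string::operator+=(char) is the overload that
// gets picked for an int, so the value is narrowed to a single character
// instead of being formatted as a number.
std::string broken_message(int line_no) {
    std::string s;
    s += "failure at line ";
    s += static_cast<char>(line_no);  // what `s += line_no` effectively does
    s += "; aborting.";
    return s;
}

// The fix tk suggested: let an ostringstream format the integer.
std::string make_failure_message(int line_no) {
    std::ostringstream msg;
    msg << "failure at line " << line_no << "; aborting.";
    return msg.str();
}
```

With line_no = 65 the broken version produces "failure at line A; aborting.", and small line numbers turn into control characters, which is presumably why the output looked random.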


Paper: accepted

My paper on FRII: accepted. Hooray!

Open Source Science

Via slashdot, Climatology debate goes open-source. Some climate change challengers made their code available, essentially challenging their opponents to do the same. In this case, the argument is actually about analysis techniques, so it's an especially apt move. I enthusiastically applaud these people.

There were many, many, many, many, many moderately interesting comments.

The distinction between showing your code and not showing your code has nothing to do with your funding source. It's all about publication.

Look, if you want to publish your work, then you need to make your data and your analysis techniques completely transparent. Anything less is Not Science. I think this is especially true of software. (Honestly, if your code is too ugly to show other people, it's probably pretty buggy, too.) And I don't care who funded it: I'm not going to favorably review any paper that does some big-ass computation but doesn't make their analysis system available. Closed source makes a multitude of sins possible.

Of course, just because you need to show your work doesn't mean copyright and license restrictions don't apply. I don't think everybody need release under an OSI-certified OSS license; it's enough if I can see and run your code.

Separately, the Wash Post had an interesting article on the meaning of the word "theory", too. Worth reading if you're interested in this stuff. (Via AMK.)

Croquet and Squeeeeeeak

Alan Kay is da bomb. But what is Croquet actually useful for?? The site is vague and general to the point of drooling idiocy! Is this what happens when you start solving the problems of the world? ;)

I'm serious. I'd like to know! It looks cool. Neat. Nifty. What does it do, other than radically realign the feudal nature of interdependence on the Internet?

Software sucks

Released a couple of updates to FamilyRelationsII, Cartwheel, and the whole mess o' software. Man, I've gotta find a way to automatically test that GUI; people keep on finding fairly obvious bugs right after a new release. Sigh.


18 Mar 2005 (updated 18 Mar 2005 at 19:21 UTC) »
Violating laws of physics: bad

I just don't understand. OK, so it's a moderately cool article about some of the strange stuff that goes on in science. But it's riddled with nonsensical statements and exhibits a flawed understanding of the scientific method.

For example, on dark matter: "If I could have my pick, I would like to learn that Newton's laws must be modified in order to correctly describe gravitational interactions at large distances," she [Vera Rubin] says. "That's more appealing than a universe filled with a new kind of sub-nuclear particle."

Err, first of all, you might want to talk about modifying the gravitational theory of the 20th century rather than the gravitational theory of the 19th century. General relativity, anyone?

update: fzort points out the MOdified Newtonian Dynamics pages, which look pretty reasonable. (IANAP, though.) Still... my point about Occam's razor stands ;).

Second of all, anyone who thinks that modifying the laws of nature is more desirable than discovering a new slot in the current paradigm doesn't understand Occam's Razor.

(These are almost certainly misquotes; I'm sure Vera Rubin is a good scientist who understands what she's doing.)

But the article really takes the pot with its mention of cold fusion and homeopathy. Here's a hint: if your theory requires a change to prevailing paradigms, the burden of proof is on you. This is a point that needs to be emphasized, because it's a simple way to distinguish crackpots from scientists. Crackpots make a few measurements, assert that they've discovered a new paradigm, and then argue about it for the rest of their lives. Scientists make a lot of measurements, assert that there's nothing new but that they still need to explain a few things. Then, at some point, enough different scientists find a hole that there's a general belief in the need for a new theory. Read up on Einstein and the photoelectric effect if you want to see how quantum mechanics went through this process.

Conspiracy theorists point to a few standard tropes as evidence that scientists are part of some gigantic cabal: general acceptance of global warming, general rejection of cold fusion, homeopathy, and creationism. What do all of these things have in common? One commonality -- I've said this before -- is that these areas aren't generating testable hypotheses. In the case of climatology, this is because climate prediction is really effin' hard. In the case of cold fusion, it's because the true believers in cold fusion won't open up their methods to be tested. In the case of creationism, there simply aren't any testable hypotheses. And in the case of homeopathy, negative results simply aren't noticed by the true believers.

I find it almost amusing that people believe scientists are part of some gigantic cabal. The only thing most scientists are united in is the belief in the scientific method. Everything else is up for grabs & hence is the source for unending arguments.

For the proper way to deal with something like cold fusion, check out this article on cavitation and table-top fusion. Yeah, that's right -- the authors actually published their work and are looking for other people to confirm it. It's called "science", folks...

fumingly yours,

Camino: da bomb

Day 3 of using Camino. No crashes, fast as ever, still happy. Only complaint: why do Safari, Firefox, and Camino all have different keys for switching between tabs!?

Garbage Collection: Python vs C++

Chui Tey gives a nice example of how something that works in C++ wouldn't work in Python.


Spent some time bouncing around del.icio.us last night. (That's almost as hard to type as "laphroaig"!) Sooooo, is del.icio.us here to stay? Didn't seem too active when I was there: a lot of the "most popular" pages were empty, including the one for Python (which is now populated by one entry). Maybe I just don't understand.

"python" vs ...

ncm points out that it can be informative to look at "python sucks" as well as "python" to get an idea of language "popularity". Interesting results.

Murphy's Computer Laws

This came across the IP list recently. My favorite is #10, "The number one cause of computer problems is computer solutions."

Speaking of "laws"...

Two new rules o' thumb. First:

Anyone using "quantum" and "biology" in the same sentence is full of it, unless the sentence is "quantum mechanics has nothing to do with biology".

Corollary: "quantum evolution" is bullshit. (Sorry, Jean ;).


Second:

Patterns in genomes aren't (scientifically) interesting unless you have a specific, testable hypothesis about their meaning.

Yes, we already know that genomes are highly non-random. So what? What does it mean?



Safari --> Firefox --> Saf... no, Camino

Camino latest is blazingly fast, doesn't sit & spin, has no focus issues, and is very pwetty. Highly recommended.

Switching between three browsers is a bitch. Most of my bookmarks are on a private 'links' page, but I didn't realize how much I depended on saved user/pass info. Ever since I settled on my iBook as my portal to, well, everything -- an 80x40 terminal window running screen, Web browser, and X server satisfy roughly 99% of my needs -- I've been letting my browser remember my login info. Now I need to have sites send the passwords to me, and in some cases they randomize 'em. Argh. In one extreme case -- my local library -- I'm going to have to go visit the library to get a new password.

C++ sweetness

Proper use of C++ sure is nice & clean. I'd still prefer the cleanliness of try/finally, but yesterday's global interpreter lock class is useful enough that it should be put somewhere for other people to find. I wonder if I could convince GvR etc. to put it in the (currently very short) writing extensions in C++ docs? I couldn't find a place in the Python cookbook for what is, technically, a C++ recipe...

Thanks to Chris Frey, Peter Hart, and Max Caceres for their help on this!

My C++ code is beginning to resemble my Python code. (It's still uglier, of course ;). By and large I can do lots of stuff in small amounts of code, and any real ugliness can be hidden in short, easily-tested functions in the implementation file.

Anyway, I'm at a 1.0rc1 release for paircomp, now that I've got error reporting working. I've also made khmer 0.2 available. It's a simple, fast k-mer counting program for whole-genome k-mer statistics. I'm always surprised at how fast you can do the simple stuff: khmer can count all 12 bp words in a 5 Mb genome in less than a second. Now to try it out on human (600 times larger)... ;)
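For anyone wondering why the simple stuff is so fast, here is a rough sketch of one common way to count k-mers: pack each base into 2 bits and use the packed word as an index into a table of 4^k counters. This is an illustration of the general idea, not khmer's actual code; base_code and count_kmers are names I made up for the example, and it only looks at the forward strand.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Map a base to 2 bits; returns -1 for anything that is not A, C, G or T.
int base_code(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Count every k-mer in seq (forward strand only). Requires 1 <= k <= 15.
// The result has 4^k slots; slot i holds the count of the k-mer whose
// 2-bit-per-base encoding is i.
std::vector<std::uint32_t> count_kmers(const std::string& seq, unsigned k) {
    const std::size_t n_slots = std::size_t(1) << (2 * k);
    const std::uint32_t mask = static_cast<std::uint32_t>(n_slots - 1);
    std::vector<std::uint32_t> counts(n_slots, 0);

    std::uint32_t index = 0;  // rolling encoding of the last k bases
    unsigned valid = 0;       // number of consecutive valid bases seen

    for (char c : seq) {
        const int code = base_code(c);
        if (code < 0) {       // e.g. 'N': restart the window
            valid = 0;
            index = 0;
            continue;
        }
        // Shift in the new base and drop the oldest one via the mask.
        index = ((index << 2) | static_cast<std::uint32_t>(code)) & mask;
        if (++valid >= k) {
            ++counts[index];
        }
    }
    return counts;
}
```

For k = 12 the table has 4^12 = 16,777,216 slots, so the whole thing is one pass over the sequence with a shift, a mask, and an increment per base.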


13 Mar 2005 (updated 13 Mar 2005 at 22:58 UTC) »
Safari --> Firefox and back again

I tried out Firefox on my iBook, because Safari was spinning the little wheel too much. Firefox isn't much better, and has a number of focus problems. Plus it doesn't look nearly as pretty. YMMV.

I'll try out Camino when it hits 0.9...

Fun with Python segfaults

Can anyone spot what's wrong with this C++/Python wrapper code?

PyObject * ret;

try {
   long val = heavy_computation_stuff();
   ret = PyInt_FromLong(val);
} catch (program_exception & e) {
   PyErr_SetString(PyExc_Exception, "whoops, I got broke");
}

return ret;

(It segfaults.)

I'll give you a hint: it has to do with the global interpreter lock.

Oh, and there are actually two bugs in the code, but only one actually causes a crash.


Well, I'm sure you're on tenterhooks now, so I'll give you the answer to the guaranteed segfault: you need to wrap PyErr_SetString in Py_BLOCK_THREADS/Py_UNBLOCK_THREADS.


OK, well, the other bug is the same thing, it just doesn't cause problems in this particular instance: PyInt_FromLong also needs to be wrapped in Py_BLOCK_THREADS/Py_UNBLOCK_THREADS.

I forgot the cardinal rule of the GIL: any time you call into Python (the C API included), you need to turn threads off, i.e. hold the GIL.

ARRRGGGH, I swear this took me the better part of a day to figure out.

But now I'm stuck. Without try/finally in C++ I can't guarantee cleanup if I do an ALLOW_THREADS in the try block. And it'd be severely ugly (not to mention moderately error-prone) to set a flag when an exception is raised, e.g.

try {
} catch (...) {
   exception_raised = true;
}
if (exception_raised) { END_ALLOW_THREADS; }

(Yeah, I'd need to redefine the macros to make this work anyway.)


Chris Frey and Peter Hart pointed out that you can get the same functionality as try/finally by using classes. So my solution now looks like this:

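try {
   {
      py_thread_saver save;      // releases the GIL
      val = long_computation();
   }                             // 'save' is destroyed here: GIL re-acquired
   ret = PyInt_FromLong(val);
} catch (...) {
   // error handling goes here; the GIL is held again by this point
}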

Since the py_thread_saver object is an automatic variable, it gets destroyed at the end of the code block, whether the block exits normally or because an exception was thrown.

The py_thread_saver class is pretty simple:

// a class to automatically handle saving of thread state.
class py_thread_saver {
protected:
   PyThreadState * _tstate;
public:
   py_thread_saver() { _tstate = PyEval_SaveThread(); }
   ~py_thread_saver() { PyEval_RestoreThread(_tstate); }
};

There are a bunch of good variations on this that can fit more complicated scenarios, but this solves my problems perfectly. Thanks, guys!
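If you want to convince yourself that the destructor trick really behaves like try/finally, here is a small stand-alone sketch that needs no Python headers. It assumes a plain bool as a stand-in for the interpreter lock; g_lock_held, lock_releaser, risky_computation and lock_restored_after are all invented for this example.

```cpp
#include <stdexcept>

// Stand-in for the interpreter lock: true while "we" hold it.
bool g_lock_held = true;

// Same shape as py_thread_saver, but toggling the flag above instead of
// calling PyEval_SaveThread()/PyEval_RestoreThread().
class lock_releaser {
public:
    lock_releaser() { g_lock_held = false; }   // release on entry
    ~lock_releaser() { g_lock_held = true; }   // re-acquire on scope exit
    lock_releaser(const lock_releaser&) = delete;
    lock_releaser& operator=(const lock_releaser&) = delete;
};

// Hypothetical computation that may fail.
long risky_computation(bool fail) {
    if (fail) throw std::runtime_error("whoops, I got broke");
    return 42;
}

// Returns true if the lock is held again once the inner block has been
// left, either normally or by the time the exception handler runs.
bool lock_restored_after(bool fail) {
    bool held_in_handler = false;
    try {
        {
            lock_releaser release;          // lock dropped here
            (void)risky_computation(fail);
        }                                   // destructor runs here on success
        return g_lock_held;
    } catch (const std::exception&) {
        // Stack unwinding already destroyed `release` before we got here.
        held_in_handler = g_lock_held;
    }
    return held_in_handler;
}
```

Both lock_restored_after(false) and lock_restored_after(true) return true, which is the whole point: the cleanup happens on either path without any flags.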


Publications and open source

My paper on FamilyRelations etc. has finally been reviewed; both reviewers liked it, although one requested some clarifications before final acceptance. I'm fixing it today and will send it in within the week; assuming I don't screw up the revisions, it'll be out by Apr 1st. No mention of the "publicly available software isn't original any more" nonsense of BioTechniques.

Another paper, Anaerobic regulation by an atypical Arc system in Shewanella oneidensis, was accepted last week. This was work done along the lines of my earlier paper on finding binding sites in microbes. In this case, someone in a local lab tested the results of a different search and showed that my search was moderately predictive of function. (online materials here)

A multitude of motifs

Spent some time working out some simple math on motif finding. Will talk about it in more depth when I have the energy ;).

British books, again

Alastair Reynolds has a new book "Century Rain" (ref). Jon Courtenay Grimwood has a new book "Stamping Butterflies" (ref). Richard Morgan has a new book, "Woken Furies" (ref). Iain Banks, John Meaney, and Steven Erikson all have new books out, too. What do all of these people have in common? They're published in England, so I can't get them without paying $exorbitantly$ for shipping.

I may see if Munro's Books can special order any of these and then tranship them to me in the US. Hefty price tag, tho, to buy that many books at once. Sigh.


Meeting up early

Had our SoCal python interest group meeting last night; 7 people showed up. Very interesting! Meeting in person is higher bandwidth than talking online ;).

Grig gave his PyCon talk on Agile Testing methodologies. Very thorough presentation; lots of good software out there. Someone should give his PyUnitPerf software a try already! I won't give you the punchline of his talk, you should go to PyCon (or read his blog)...

I gave a short demo of my supercalifragilisticexpialidotious side project on annotating URIs. Natch, my laptop died just as I was setting up. 1st time in months. What is it about demos!? Anyway, good thing that it was mostly a Web demo, so I could swipe someone else's computer, and the crash prevented me from showing my PowerPoint. Probably a good thing there, too. People were very nice and enthusiastic about the possibilities.

The pizza was good, too.

Resolved: we will advertise more widely. We will do some intro talks (people wanted to hear about our experiences with Quixote, I wanted to hear about Greg McClure's experiences with CherryPy, Daniel Arbuckle was queried re metaclasses). We may meet at the dank & deserted marine lab again. We discussed a couple of ideas for community participation in Python stuff, too. More anon.


RIP: Hans Bethe

July 2nd, 1906 to March 6th, 2005.

Hans Bethe died last night at dinner, at the age of 98. He was one of the 20th century's greatest physicists; among his other accomplishments he received the Nobel Prize for describing the H --> He conversion that fueled our sun. Most physicists were probably surprised to learn that he was still alive; he was literally responsible for laying much of the groundwork in atomic and nuclear physics in the 1930s, and contributed immensely to many different areas of physics throughout the century.

He also collaborated closely with my father for almost 30 years. Some of their work is still moderately controversial (e.g. low mass black holes). He was hoping to live to see LIGO confirm some of their latest theories on neutron-star binary mergers, but that was not to be.

For many years (~1985-2000) he and my father travelled out to California to work at Caltech for a month each January. I got to know him a bit during those months, because he and often his wife Rose would stay with my father in the same apartment. He was always very mentally active, even as his physical abilities declined over the years. It was always tricky doing things like picking him up at the airport, because you wanted to be careful with this living legend! I knew that if I had an accident with him in the car, I'd be infamous throughout physics...

My friend Chris Adami has written a book called "Three Weeks with Hans Bethe and Gerry Brown", describing a short period in 1992 that Chris spent with Hans and my father. It captures Hans' intellectual depth and conversational style perfectly. I hope it will be published soon.

Hans is one of two or three people directly responsible for my entry into biology. He told me that when young scientists asked him what field he would go into were he starting in science now, he would emphatically respond "Biology!" He believed that biology would be the field with the next big achievements, and -- as always -- he was right.

I will miss him.


p.s. Wikipedia, as usual, is up to date...


Wrote another toy WSGI application tonight: wsgiFeedSuck.py. It wraps RSS feeds with a simple WSGI app that displays the titles and summaries.

You can try it out, for the nonce, in CGI mode: here.

I wrote wsgiFeedSuck to learn how to use Mark Pilgrim's excellent feedparser module. Astute examiners of the code will note that I use 'etag', 'modified', AND only check the feed every hour. Yay me ;).

The only real problem I have with the code is the lack of file locking around the shelving. O well. Suggestions welcome.


OK, I'm posting it. Happy? Damned voices, yammering away in my head...

(I do like the fact that Web-based darcs repositories are also Web sites in their own right. Very convenient when you don't want to do any work to "release" something.)


oubiwann, congrats on remembering to X out your password. I didn't, the first time I posted such a script ;).

chromatic, nice enthusiastic article on PostgreSQL. (I especially like the Oracle user's comment at the bottom: "but in our nice shiny expensive database, we've been using this for eons...")

avrietta, I couldn't agree more. At least about the steak. And maybe the single malts. But seriously, these jokers are running Wikipedia on non-ACID databases!? Whoo. You should beat up on me anyway, though, I like my tri-tip marinated. (I can't afford better cuts of meat.) But I do sear it. A few BBQs ago, I let a German cook the meat -- he kept on telling me it wasn't done, until finally I realized he was "searing" it all the way through. He'd already ruined it by then, but luckily he tasted good with the cajun BBQ sauce I like, so I didn't go hungry for long.

robocoder, your wedding site is down. You should use a co-loc. ;) Ummmm and you should also keep it updated...

