Recent blog entries for thomasd

A new year, and another new attempt at keeping a diary! This is going to be a year of change for me. Most importantly, I'm getting married this autumn -- very exciting, but probably means I need to start some more serious planning soon. Also...

Geek stuff: In the dying hours of 2003, I finally gave in and ordered an Apple laptop. This has been a long and fairly difficult decision. On the positive side, Apple have a very attractive (technically as well as cosmetically) OS, and seem to be one of the few companies interested in making laptops which officially support any flavor of Unix. But I do have a few reservations about switching back to a part-proprietary OS. Anyway, it's going to be a few weeks before the machine actually arrives. In the meantime, I'm looking at software issues, and especially the various versions of Mozilla available for OSX.

Science: I'm going back to work tomorrow, and feel invigorated by the Christmas break -- lots of new ideas to try. My plan for the next few months is to apply independent component analysis to genome data. The big question is what does `mixing' mean when you're talking about text? Right now, I'm just looking at word frequencies (which seem to be working pretty well on some toy examples), but I think this is going to take me further in the direction of language modelling. Should be interesting!
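
For the curious, here is a rough sketch of what "looking at word frequencies" could mean for sequence data. It is an illustration only, not the actual analysis code: it assumes a "word" is an overlapping k-letter substring of the sequence, and the function names `kmer_frequencies` and `frequency_matrix` and the default `k=3` are made up for this example.

```python
from collections import Counter


def kmer_frequencies(sequence, k=3):
    """Relative frequencies of overlapping k-letter words in one sequence.

    Assumption: a "word" is any overlapping substring of length k.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        # Sequence shorter than k: no words to count.
        return {}
    return {word: n / total for word, n in counts.items()}


def frequency_matrix(sequences, k=3):
    """One row per sequence, one column per word seen in any sequence."""
    profiles = [kmer_frequencies(s, k) for s in sequences]
    vocabulary = sorted(set().union(*profiles))
    rows = [[p.get(word, 0.0) for word in vocabulary] for p in profiles]
    return vocabulary, rows


if __name__ == "__main__":
    vocab, rows = frequency_matrix(["AAAA", "ACAC"], k=2)
    print(vocab)
    for row in rows:
        print(row)
```

Each row of the resulting matrix is one sequence's word-frequency profile, which is the kind of observation table a component-analysis method would take as input.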

Hmmm... Long time no diary entry. But mstevens has complained, so here I am, back again...

The last few months have been a mad whirl. Having not been to any conferences at all last year, four in a row this summer was maybe a bit much, especially since I was joint organizer of one of them. But I've learned a lot from them, and spread the word about my projects.

BioJava: Well, I'm now officially `in charge'. At least, to the extent that anyone is. Having root on the server will be useful from time to time, anyway ;-). It's now almost two years since we started, and it's been amazing to watch the developer base grow over the last year. We're also getting one or two people who say they specifically want to write documentation. Wow.

Looking through the list of new stuff since 1.1, we're now well overdue for another major release. Time to see if anyone pays any attention if I call for a couple of weeks of feature-freeze...

DAS: World domination is nigh!

Science: reminder to self: you've still got a PhD to finish. My first gene regulation paper has been written for a month or so now. Now I just need to get it off my supervisor's desk and into a journal editor's office. Hmmmm, could be tricky...

Other stuff: going to a wedding on Saturday, so no chance for a quiet weekend here. It'll actually be the first wedding I've ever been to. None in my family since before I was born, and all my friends seem to have successfully avoided it until now. Wish me luck!

Got the design patterns for Java-EnsEMBL reasonably solid in my mind now. This just leaves lots of coding to give nice Java classes for each EnsEMBL data type. I can't help feeling that it ought to be possible to autogenerate a lot of this from the SQL schema. But realistically, there's just enough `exciting' behaviour which has to be woven in that autogeneration probably isn't an option at the moment. Heigh ho (and let's hope the objects don't need to change too much in the future...)

UK under water: Yesterday morning, the lake at the Sanger Centre had grown considerably. By lunchtime, one of the overflow car parks was flooded, and in the middle of the afternoon, a message went round saying that one of the transformer rooms was threatened. The network was taken down, and large parts are still missing (well, actually maybe not such large parts, but since we can't access some vital NFS servers...). Fortunately, there's power for my laptop and a connection to the outside world.

All quite irritating, since it means I don't stand much chance of accessing a database I'm supposed to be working with. But I guess it's a chance to catch up with e-mail and some coding loose ends.

PS. Rumour has it that a small island has just reappeared in the middle of the (still swollen) lake. Maybe things are getting back to normal.

Don't seem to have got very much done this week -- possibly due to having to give several presentations in quick succession. I guess the good side of this is that I shouldn't have to present anything else for some time to come. Or at least, that's the theory.

Software: Just got a new build of Mozilla running on my Alpha box, and it rocks. Last month, it was dying horribly in hard-to-pin-down 64-bit-only ways. Now it seems as stable as it is on my Linux/Intel machine. Well, I guess this means the end of my happy (yeah, right) relationship with Netscape Classic.

I've also been experimenting with a development snapshot of PostgreSQL. Specifically, the new 7.1 feature which finally removes the block-size limit for table rows. I can now happily dump hundreds of kilobytes of genome sequence into normal `text' attributes with no fiddling at all.

Coding: Found a bit more time to work on the Java EnsEMBL API. It's now connecting to the database and I can fetch a few object types. Hopefully have it complete enough to be useful for me in the next few days, then I can concentrate on getting it ready to demonstrate to the other EnsEMBL developers. Guess it'll be a while before the project really moves away from Perl, though.

Excursions: Went to the Imperial War Museum on Saturday. Lots and lots of aeroplanes (plus various other kinds of vehicle). One of the exhibits was an old (late 1940s, I think) radar guidance system for an anti-aircraft gun. How did they do that without microprocessors? If it were designed today, you can bet that it would be running NT or somesuch.

WebDAV: Had a browse through the specs (plus the delta-V versioning protocol), and it looks good. Looking forward to subversion even more now.

After a lot of digging, and a recompile of gcc, I think I've finally got to the bottom of the Java/C++ on Tru64 Unix problem. If you're using threads, you have to compile absolutely everything with _REENTRANT defined. I guess that makes sense, but when there isn't a reentrant version of the standard C++ library, it doesn't make life easy... Wish I could stick to Linux.

Homework: Read up on WebDAV.

Spent much of the day trying to get Java native methods working reliably on Tru64 Unix. Conclusion: if you want to use C++ and iostreams, forget it.

Have been looking a bit at subversion. I've been using CVS very heavily for the last year, but this looks really promising. Might try grabbing a copy over the weekend to see how development is going.

Added exception support to the bytecode library. Still think the .class file format is weird, but I've got quite a few of the wrinkles hidden behind nice library code now. Ought to get round to writing some web pages explaining what's going on.

Gradually making progress with the EnsEMBL API.

Largely been focusing on my research projects for the last few days. Some interesting new strands coming together -- hopefully find the time to write it all up some day soon.

Friday 13th today -- wonder how my luck is going...

Coding: Talked about the design of a Java API for EnsEMBL. I'll start coding this up over the weekend, hopefully, and it might reduce the chances of me having to write more Perl.
