Older blog entries for gobry (starting at number 32)

python position in switzerland

I'm leaving my current job at the end of June, and this will make a nice open position for a hacker wishing to work on an open-source project written in python, in a very nice area of Switzerland (the French-speaking part, BTW).

The job is at EPFL (the Swiss Federal Institute of Technology in Lausanne), and is about continuing the development of a service called infoscience, which provides researchers from more than 200 labs with a common platform for managing their publications. It is not (only) an archive, as it behaves as a web service for plenty of web sites, and also pumps information from external systems.

So basically the job is about balancing the demand for specific tools (for a given lab, for the librarians, ...) and the need for a high-level common platform (which already provides conversion services, quality checking tools, ...). Part of the software is developed in collaboration with the CDS team at CERN; the rest is specific to infoscience, and its design is completely up to the person in charge of this job.

The detailed offer and the procedure for applying are available on the EPFL jobs website, in French. Feel free to contact me directly for more info.

(and no, I'm not leaving because it's boring, but because I've been recruited by Google :-))

RSS for software releases?

When I do a new release of a piece of software, I usually just cannot remember in which of the numerous repositories I've registered it... What I would like is some RSS-for-applications standard, where I could put all the information that, say, freshmeat requires from me. Then I could just register the feed once in every repository (or even let the repositories harvest the feeds :-))

Does such a beast already exist?
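
To make the idea concrete, here is a minimal sketch of what such a feed could look like, using plain RSS 2.0 elements and Python's standard library. This is only an illustration, not an existing standard: the function name, the dictionary keys and the example.org URLs are all made up for the example.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def release_feed(project, homepage, releases):
    """Build an RSS 2.0 document with one <item> per release.

    `releases` is a list of dicts with hypothetical keys:
    version, url, notes, timestamp (seconds since the epoch).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{project} releases"
    ET.SubElement(channel, "link").text = homepage
    ET.SubElement(channel, "description").text = (
        f"Release announcements for {project}"
    )

    for rel in releases:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"{project} {rel['version']}"
        ET.SubElement(item, "link").text = rel["url"]
        ET.SubElement(item, "description").text = rel["notes"]
        # RSS 2.0 expects RFC 822 style dates, which formatdate produces.
        ET.SubElement(item, "pubDate").text = formatdate(
            rel["timestamp"], usegmt=True
        )
        # A stable guid lets a harvesting repository skip known releases.
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = f"{project}-{rel['version']}"

    return ET.tostring(rss, encoding="unicode")


feed = release_feed(
    "myproject",
    "http://example.org/myproject",
    [{"version": "1.0.4",
      "url": "http://example.org/myproject/1.0.4",
      "notes": "Bug fixes",
      "timestamp": 0}],
)
```

A repository would then only need the feed URL once, and could poll it for new items.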


I received the shipping notification for my N770 yesterday, with a nice surprise: as they don't ship to Switzerland, they cannot bill me, so they sent it for free! I just hope the box won't be empty... :-D

Web development

Christmas approaches, and I made one more attempt at a nice web-based wishlist system. It's called Mes Souhaits, in French for the moment, but I might translate it one day.

It was a good opportunity to give Nevow another try, and I loved the experience. It's really clean and very reusable, and the overall impression is of very little wasted work while designing the code. I also use mechanoid to run some tests on the site.

rmathew, I would certainly second your choice of reading Practical Common Lisp. I was in a similar situation, both reluctant to dive completely into a new language with such a specific feel, and strongly attracted both by writings I stumbled upon regularly, and by interesting libraries that were lisp-influenced (the generic functions implementation in python being one, see PyProtocols). And the book certainly managed to bring my level of interest way over what is necessary to start working with lisp regularly!

I continued my investigation into what my next programming language should be (after python), and finally took some time to read Practical Common Lisp. It is very pleasant and provides strong arguments as to why The Lisp Way is indeed superior to many other approaches: lisp macros, generic functions, conditions and restarts all seem to be elegant and powerful solutions for those cases where your instinct tells you there must be a better way to do that (and I'm mostly a python programmer these days, so I already expected a lot from my language of choice :-))

I wandered a bit in the debian archive, installing sbcl and cmucl to play with the examples in the book (I especially liked the unit-testing framework written in 26 lines :-)). What struck me is that there seems to be no program in the archive that is fully implemented in lisp (there are plenty of libraries however). Either there is some showstopper I have not found yet, or I did not search the archive properly. The fact that you need a runtime to execute your programs is not a valid reason, as it is the case with many other languages now. So, what?

So, for the moment I put Haskell aside, as I feel much more at ease with multi-paradigm languages. Let's see what real project I can find in order to practice a bit...

replication, now (really!)

A long time ago, I wished I could have all my personal data (addressbook essentially) available in read-write mode from any of my work places (linux laptop, home iMac, linux at work), and also shared with my wife.

As this remained an itch to scratch for too long, I finally decided to see how far I could solve it myself. So I've taken replication 101 (mostly by following references from the interesting white paper from the unison project), and experimented a bit in Python with simple ideas.

The result fits my needs (I can read and write from several places, and the system handles propagation and updates, and reports conflicts), but is still far from either complete (I still need to put some sugar on the conflict resolution procedure, to finish the Addressbook.app client, ...) or polished (the implementation is certainly neither space nor time efficient).
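
For readers wondering what "propagation and conflict reporting" can look like in practice, here is a minimal sketch based on version vectors, a standard technique from the replication literature. It is not the implementation described above, whose internals are not shown here; the `Replica` class, the `compare` function and the record layout are assumptions made for illustration, and deletions are not handled.

```python
def compare(a, b):
    """Compare two version vectors (dicts mapping replica name -> counter).

    Returns "equal", "newer" (a dominates b), "older" (b dominates a)
    or "conflict" (both sides were edited independently).
    """
    replicas = set(a) | set(b)
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in replicas)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "conflict"
    if a_ahead:
        return "newer"
    if b_ahead:
        return "older"
    return "equal"


class Replica:
    """One copy of the data, e.g. the laptop or the home iMac."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # key -> (value, version vector)

    def write(self, key, value):
        """Store a value locally and bump this replica's own counter."""
        _, clock = self.records.get(key, (None, {}))
        clock = dict(clock)  # copy, so vectors shared after a sync stay intact
        clock[self.name] = clock.get(self.name, 0) + 1
        self.records[key] = (value, clock)

    def sync_with(self, other):
        """Propagate changes in both directions; return conflicting keys."""
        conflicts = []
        for key in sorted(set(self.records) | set(other.records)):
            mine = self.records.get(key, (None, {}))
            theirs = other.records.get(key, (None, {}))
            order = compare(mine[1], theirs[1])
            if order == "newer":
                other.records[key] = mine
            elif order == "older":
                self.records[key] = theirs
            elif order == "conflict":
                conflicts.append(key)  # left untouched for manual resolution
        return conflicts


laptop, imac = Replica("laptop"), Replica("imac")
laptop.write("alice", "555-0100")
first_sync = laptop.sync_with(imac)    # propagates the entry to the iMac
laptop.write("alice", "555-0101")      # edited on the laptop...
imac.write("alice", "555-0199")        # ...and independently on the iMac
second_sync = laptop.sync_with(imac)   # reports the conflict on "alice"
```

The point of the vectors is that a plain timestamp cannot tell "edited later" apart from "edited independently", whereas comparing per-replica counters can.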

At least I feel better now :-)

BTW, if you want to play with it, it's available here:


It will probably randomly discard your data, crash your network and repaint your bedroom, but if you wish to test it, feel free.

Haskell is driving me nuts. I really like its expressiveness, but lately I had a problem: my short program (a log parser), which used to work with a constant memory footprint (thanks to good advice regarding strict data types), started to suck up more than 100 MB again. The culprit seems to be my introduction of a unit test in the module, which steps on the toes of the optimizer. Am I just unlucky, or is any Haskell programmer supposed to understand in deep detail how the compiler optimizes one's code?

In the meantime, I also dug a bit into the bibliography regarding replication techniques for disconnected devices. So far, I'm looking in the direction of Harmony and Rumor.

17 Dec 2004 (updated 17 Dec 2004 at 09:03 UTC)
e8johan: this is almost covered up by standard HTTP headers (you can ask for a page if it has been changed since a given date, or check its etag). But indeed, compared to NNTP for instance, it's still not very scalable.
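
As an illustration of those headers, here is a minimal sketch of a conditional GET with Python's standard `urllib`. The helper names `conditional_request` and `fetch_if_changed` are hypothetical; the headers themselves (`If-None-Match`, `If-Modified-Since`) and the 304 status code are standard HTTP.

```python
import urllib.error
import urllib.request


def conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server skip unchanged content."""
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None on 304 Not Modified."""
    request = conditional_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(request) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Nothing changed: keep the validators we already had.
            return None, etag, last_modified
        raise
```

A client would store the returned validators and pass them back on the next poll. Every client still has to poll, though, which is where the scalability concern comes from.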

dcoombs: while you're already trained for slow execution, why don't you feed your program to valgrind? It's the tool that saved the day when I was still programming in C/C++... :-) And given what it actually does, it's not _that_ slow.

Job: still working on a nice project based on CDSware. It's a good document management platform which is now mainly in Python but with some remaining parts in PHP. The team has a really interesting sensibility regarding high level languages (not only python, but also in the functional family), which helps in thinking in terms of "the right tool for the job", and not in terms of "the hype of the day". They managed to get very good performance in searching almost 1M documents, with complex queries running in less than 1s, by using boolean vectors from Numerical Python, serialized in a MySQL database.
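
The bit-vector idea can be sketched in a few lines. To stay dependency-free, this sketch uses plain Python integers as bit vectors instead of Numerical Python arrays, and it is not CDSware's actual code: the `BitIndex` class and its methods are hypothetical names chosen for the example.

```python
class BitIndex:
    """Inverted index mapping each word to a bit vector over document ids."""

    def __init__(self):
        self.vectors = {}  # word -> int used as a bit vector
        self.size = 0      # number of indexed documents

    def add(self, text):
        """Index a document and return its id (its bit position)."""
        doc_id = self.size
        self.size += 1
        for word in set(text.lower().split()):
            self.vectors[word] = self.vectors.get(word, 0) | (1 << doc_id)
        return doc_id

    def hits(self, word):
        """Bit vector of the documents containing `word` (0 if unknown)."""
        return self.vectors.get(word.lower(), 0)

    def ids(self, vector):
        """Expand a bit vector back into a sorted list of document ids."""
        return [i for i in range(self.size) if (vector >> i) & 1]

    def dump(self, word):
        """Serialize one vector to bytes, e.g. for storage in a BLOB column."""
        return self.hits(word).to_bytes((self.size + 7) // 8, "little")


index = BitIndex()
index.add("higgs boson search at cern")          # document 0
index.add("python bindings for cern libraries")  # document 1
index.add("python log parser")                   # document 2

# Boolean queries become bitwise operations on whole vectors.
both = index.ids(index.hits("python") & index.hits("cern"))      # AND
either = index.ids(index.hits("python") | index.hits("cern"))    # OR
without = index.ids(index.hits("python") & ~index.hits("cern"))  # AND NOT
```

Each query term costs one lookup, and combining terms is a single bitwise operation over the whole collection, which is what makes complex boolean queries cheap.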

Released version 1.0.4 of Garlic. From the outside, it's another web-based bookmark manager, but besides being a personal itch I had to scratch, it also serves as a testbed for several things:

- Pybliographer 1.3: this branch uses a bsddb backend to store the data, and has its own file format for exchange. So garlic is useful to test if someone would actually like to code against pyblio-1.3, and if the code works in situations it was not especially designed to handle.

- twisted: in this version of garlic, a companion application is able to parse RSS feeds and insert them in the bookmark manager (in a dedicated folder), via twisted's RPC system. Other features might be added in a similar way, it's just that I really wanted this one for my personal use :-)

- quixote: a simple yet elegant way to generate html code in very natural python. I tested nevow a while ago (nevow is the "official" web templating solution on twisted), but it was evolving too fast at that time for someone that was also learning twisted :-) I'll give it another try soon, as it is really really elegant.

Another area I'd like to explore is some high-level testing framework for GUIs (esp. python-gtk). Certainly a tricky issue, but so far I have seen no higher-level approach than using X to simulate events. Of course a higher level probably means limiting oneself to a given toolkit. But using signals and playing with the widget tree seems to offer more power.

Don't know how I could mix that into garlic however :-]

forrest: I've updated the entry on the site. There is a moderation process, so the modified entry is not yet displayed.
