Older blog entries for malcolm (starting at number 100)

My List(s) Of Working Programmer's Books

A couple of weeks ago, Bill de hÓra published his updated list of ten books for the working programmer. There are other such lists around, of course. Bill's list was the one that most recently crossed my eyesight and reminded me I've been meaning to publish my own list.

It's not completely clear what the rules of this game are, so I had to invent my own a little bit. I've stuck to Bill's choice to be language and platform neutral, as much as possible. Rather than try to pick ten books I think everybody should read, I wrote a list of books I currently use regularly and find useful in my day-to-day work. I've added a few near-finalists and some other interesting books at the end, too.

Any list like this is going to be swayed by the experiences and interests of the author. I realised my list is skewed a bit more towards process, rather than practice, than it might have been five years ago. This is partially a reflection of the fact that I've done a lot more project and team management in the last few years than at any previous point. All the books on this list are ones I either consult regularly as a reference, or try to re-read every now and again to keep the thoughts moving around in my head.

In no particular order:

First obvious difference with other "top 10" lists: mine only has eight items. Life's like that sometimes.

I'm surprised that Introduction to Algorithms doesn't get more words written about it. Sure, it's a pretty fundamental book. However, it includes a lot of the basic theoretical underpinnings of the algorithms, and the implementation differences between them, that just aren't written down in other places. I don't use this book every day, but when I need the details of many algorithms, it's the first book I'll reach for. Still, this really is a fundamentals book, whilst the Algorithm Design Manual is more educational and thought provoking for real world, large information set problems.

I often work around problems that require security of various levels. Practical Cryptography is a bit like Introduction to Algorithms, albeit at a slightly more mathematical level, in that it gives a very solid theoretical grounding in the fundamentals of hashing and encrypting. It is futile to enter a discussion about the security of one approach over the other if you don't know this information and can't back it up with a reference to a book like this. Cryptography being a fast moving area of research, a four-year-old book is going to show some dating by now, but it's still something I use regularly to back up my hunches or as a citation source.

Most of the others should be self-explanatory if you've read them. There appears to be some genuine controversy about whether Scott Berkun's book on project management is great or gross (see the comments on Bill's post, for example). I was surprised to see that my take was almost identical to the thoughts Bill expressed in a comment: the Berkun book is very practical.

I'll just mention, too, that most lists like this include Steve McConnell's Code Complete (usually meaning the 2nd edition). I'm not a great fan of that book. It's a nice read and I have no argument with the content or approach. It's just not a book that I've found helped me a great deal. The McConnell book in my list above, Rapid Development, is one I get more use out of as a way of translating between my brain and a more professional, standard way of presenting ideas. Using McConnell's approach and terminology eases the presentation to more formal project managers and decision makers.

There are some near misses. Mostly books that I have gotten a lot of education from, but no longer use on a regular basis because I feel I have absorbed their lessons. All of these books still sit on my shelves, though, and I would give them to versions of myself that were five or ten or 15 years less experienced (not all at once, some require more experience than others to be useful):

  • Mastering Regular Expressions (Friedl)
  • Herding Cats: A Primer For Programmers Who Lead Programmers (Rainwater)
  • Career Programmer: Guerilla Tactics for an Imperfect World (Duncan)

The Friedl book on regular expressions makes a lot of people's lists, but I've never really struggled with regular expressions, so once I'd absorbed the lessons on optimisation and testing in different engine types, I found I wasn't going back to it too often. I recently re-read the latest edition and didn't feel I'd forgotten much. I may be weird in this way, though: I enjoy regular expression munging and use it a fair bit, so it stays fresh in my brain.

The other two books are of a much more practical, professional nature. As I worked in different organisations (or even the same organisation with revolving reporting charts), I needed to work a lot more on my pragmatism. I didn't (still don't to some extent) handle bad working conditions well when I'm trying to produce technical product, or manage other people to do the same. So this was an area I needed to put a lot of learning into over the past five years. These books would have been useless to a ten-year-younger version of me, but came along at the right time when I needed them.

Finally, some books that, whilst not indispensable, have been a great inspiration for learning more and thinking in different ways about my areas of expertise:

  • The Deadline (de Marco)
  • Game Programming Gems, Graphics Programming Gems
  • Mathematical Writing (Knuth)

De Marco's book is a great presentation — via fiction — about why project management is hard in the real world. Knuth's book on writing is special because it covers specifically technical writing about theoretical, logical work and focuses on presentation and differing approaches. Although about mathematics (obviously), which was how I first discovered it, a lot of the lessons transfer well to theoretical computer science presentations as well. Maybe not useful to the intensely practical programmer, but more than once I've had to prove that a program or approach worked and document that. The ... Programming Gems books are just a good source of short algorithm fragments and can make learning fun. If you can't have fun in this industry, you're just not reading the right books.

Syndicated 2007-03-26 21:29:38 from Malcolm Tredinnick

Django Tips: Variable Choice Lists

Been a while since I added to this series. I've come across a couple of repeated questions lately, so it's time to give back to the knowledge pool again.

This time: using iterators to customise the options presented via the choices attribute on a model field.


Before launching into the solution, let's consider the problem we are trying to solve. If you have a model field that is intended to hold only one of a limited number of values, Django provides the choices attribute. You can use it like so:

  from django.db import models

  class Document(models.Model):
      # Each pair is (value stored in the database, label shown in forms).
      CHOICES = [(0, 'private'), (1, 'public')]
      status = models.IntegerField(choices=CHOICES)

When you use this in a form, only the two choices private and public will be presented and the database will store either 0 or 1, depending on the choice you made.

Aside: People often forget that when you retrieve such a model from the database, although the status field contains 0 or 1, you can get back the string version of the choice using the get_status_display() method of the model. Replace status with the name of the field for your own use. This is explained under get_FOO_display() in the Django documentation.
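As a rough illustration of what that method gives you, the lookup amounts to mapping the stored value back to its label. The sketch below is plain Python, not Django's actual implementation, and display_for is a hypothetical helper invented for this example. Falling back to the raw value for an unknown entry is an assumption of the sketch.

```python
# Plain-Python sketch of the value-to-label lookup; not Django's own code.
CHOICES = [(0, 'private'), (1, 'public')]

def display_for(value, choices=CHOICES):
    """Return the label for a stored value (hypothetical helper).

    Falls back to the raw value if it is not one of the choices.
    """
    return dict(choices).get(value, value)

label = display_for(1)     # label for a known stored value
unknown = display_for(7)   # not in CHOICES, so the raw value comes back
```

On a real model you would simply call document.get_status_display() and let Django do the lookup.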

When Isn't This Enough?

There are two cases where the previous example falls a bit short.

The first case is when the list of choices is being updated regularly via changes to the database, or in some other way. In this situation, choices isn't the right approach to the problem. You are really talking about a dynamic relation to another data set. So model it that way: use a ForeignKey field to a table containing the list of choices and the values to store.

The second case is more subtle. Suppose you have a document presentation system. Documents on the production site are either public or private (more or less, this is the above example). However, the same code runs on a staging system as well, where documents are initially uploaded, reviewed and edited. On this system, the choices can include something along the lines of "ready for review" and "needs editing". This is a slight variation on similar systems I've implemented for a couple of clients recently, so it's not too unrealistic (although I've simplified a bunch of details).

In the second scenario, above, the list of choices is essentially static. So we are justified in using the choices attribute. However, the initial values vary depending upon the system type, which we might reasonably control using a settings variable.

Now, it's generally a good idea to avoid referring to settings.* in the definition of fields and methods in Django. This way you can safely import the code without needing to have configured the settings module, which usually feels like neater code organisation (import everything, then configure, if you're using manual configuration). To my eye, using settings.FOO in declarations also looks a little awkward (intuitively, it feels like a leaky abstraction, since we're delving into the depths of a module at the top-level).

For whatever reasons, whether you agree with me or not, I'm going to avoid using settings in my field declaration. Instead, I'm going to use a little-known (and not usually required) feature of the choices attribute: you can pass it a Python iterator instead of a sequence. So I can rewrite my example as follows:

  from django.db import models
  from django.conf import settings

  def status_choices():
      choice_list = [
          ('private', 'private'),
          ('public', 'public')]
      # The extra states are only offered when STAGING is set and true.
      if hasattr(settings, 'STAGING') and settings.STAGING:
          choice_list.extend([
              ('review', 'ready for review'),
              ('edit', 'needs editing')])
      for choice in choice_list:
          yield choice

  class Document(models.Model):
      # "maxlength" was the argument name when this was written; later
      # Django releases renamed it to max_length.
      status = models.CharField(maxlength=10, choices=status_choices())

You can see here that all the dependency on settings is inside the iterator function. So it isn't evaluated until Django needs to actually display the choices, which should be long after configuration has taken place. This relies partly on the fact that the Python compiler knows this is a generator function (because of the keyword yield) and consequently executes none of the code until the first value is retrieved from the generator.
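If the deferred execution seems a bit magical, it can be checked without Django at all. The following is a minimal sketch under stated assumptions: the settings object here is a plain stand-in (not django.conf.settings), and the trace list is a hypothetical addition that only exists to record when the generator body starts running.

```python
from types import SimpleNamespace

# Stand-in for django.conf.settings so the sketch runs without Django.
settings = SimpleNamespace()
trace = []  # records when the generator body actually runs

def status_choices():
    trace.append('body started')
    yield ('private', 'private')
    yield ('public', 'public')
    if getattr(settings, 'STAGING', False):
        yield ('review', 'ready for review')
        yield ('edit', 'needs editing')

choices = status_choices()       # builds a generator object; no body code runs
ran_at_creation = list(trace)    # snapshot of the trace: still empty here

settings.STAGING = True          # "configuration" happens after the call above
values = [value for value, _ in choices]  # iterating finally runs the body
```

Calling status_choices() records nothing in trace, and the STAGING value assigned afterwards is still the one the generator sees, because the settings lookup only happens once iteration reaches it.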

I would also draw attention to a couple of other implementation decisions I made in this code:

  1. The extra options only appear if the (optional) settings.STAGING setting is set to True. Note that this "fails safe", in the sense that if you forget to include the STAGING setting, it won't inadvertently expose the extra options and documents to the wider public. I made the setting optional, because I'm just a nice guy, and so had to first check that it existed using hasattr() before I tried to access it. You may or may not wish to be that flexible.
  2. I switched from storing integers, as in my first example, to storing short, readable strings in the database. I prefer this method, because it avoids the problems associated with having magic numbers in the database column. If you see the number '2' in the database, what does it mean? If you see the string 'review', things are a little more mnemonic. I've noticed a tendency for people to use integer values with the choices attribute; perhaps they are forgetting it works on pretty much any field and CharField fields are often a good choice?
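The "fails safe" behaviour from the first point can be sketched in plain Python. This is an illustration under assumptions, not the production code: the generator takes a settings-like object as a parameter (a hypothetical change so the three cases can be compared side by side), and SimpleNamespace stands in for the real settings module.

```python
from types import SimpleNamespace

def status_choices(settings):
    """Yield (stored value, label) pairs; `settings` is a stand-in object."""
    yield ('private', 'private')
    yield ('public', 'public')
    # Same check as in the text: the setting must both exist and be true.
    if hasattr(settings, 'STAGING') and settings.STAGING:
        yield ('review', 'ready for review')
        yield ('edit', 'needs editing')

# STAGING forgotten entirely, explicitly off, and explicitly on.
missing = [value for value, _ in status_choices(SimpleNamespace())]
disabled = [value for value, _ in status_choices(SimpleNamespace(STAGING=False))]
staging = [value for value, _ in status_choices(SimpleNamespace(STAGING=True))]
```

Only the last case exposes the review and edit states. A forgotten setting behaves exactly like STAGING = False.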


If you are very familiar with Django, or tried to experiment a little with this example, you'll realise I have not told the entire truth here. The whole argument about using an iterator to avoid accessing the settings module too early is pointless. You cannot currently import django.db.models without configuring the settings module, so there's a chicken-and-egg problem there. However, I consider that to be a (very small) bug in Django and it's something I want to fix in the near future. You should be able to import modules without having done any configuration.

You probably won't need to use this technique very often at all. Every now and again, though, you will run across a configuration where being able to construct an intelligent choices list will help the code layout flow more smoothly.

Syndicated 2007-03-26 14:22:51 from Malcolm Tredinnick

After a long hiatus, I've started blogging again. Because I want to try out a few different things and not all of it is Open Source related, I've moved to my own site. Henceforth, all the real non-events will be over at the pointy stick.

GNOME Summit 2004

The summit was fun. Possibly surprisingly, this was my first dedicated GNOME conference and it was nice to finally attach faces to some people that I have been communicating with via email and IRC for three or four years (in some cases). It was also a productive three days for me in that I had a few conversations with people to discuss some things that would have taken dozens of emails to work through normally.

As noted already by so many other people, the Stata Center was quite a strange building, with some very odd ergonomic quirks. But it was a good venue for something like the GNOME summit in many ways.

So, good talks, good company, good fun. It was worth the trip over to the US to attend, I think. :)

Ankh: When I was starting to mess around with XML to the point that I wanted to read specs and write programs with it (early 2000, I guess), XLink seemed like one of the more interesting specs being produced. I could imagine a browser of some kind that would allow me to create annotated versions of documents that were otherwise read-only to me. I would just have to create a file of arcs between two external targets -- a point in the source document and the annotation document. I could write the annotation myself or just use this feature to connect up two different pages in a way that was useful to me.

Standard support for this sort of thing would be great. Instead of browsing document A that describes the connections between document B and other places, I could just go through B and have the links available as I go -- much more useful to somebody who works like I do, synthesising a lot between different information sources.

So much to write at some point; probably will never get it all down.

Just quickly, though: if there is anybody in the Sydney, Australia area who wants to work with Python on Linux doing "cool stuff", we are hiring. Get in quick to avoid the rush.

I have put up a brief write-up of my linux.conf.au week. It will only stay live for about a month, since it is not really GNOME related and that machine is a bit abused these days, so read fast.

(There may be some small issues with the photo captions on IE 5.x/Win, but the document is not important enough for me to bother tracking down some way to test that.)

17 Jan 2004 (updated 17 Jan 2004 at 12:54 UTC)

And so another year's conference is over and we have to wait nearly 15 months until the next one.

Three days of interesting talks have gone past in a whirl. Every talk I went to was worth the time; speakers were well prepared, the equipment generally worked well and the rooms were comfortable enough, even when full. Havoc gave a fantastic keynote this morning about putting Linux on desktops everywhere. This is not to say Bdale's and Maddog's keynotes were not also interesting, but they said similar things to other talks of theirs that I have heard. I had not seen Havoc in full-flight advocacy mode before and it was an impressive performance.

I had a lot of conversations over the last three days of the conference with people who are interested in developing GNOME applications. One guy admitted to staying up until 0300 playing around with the Java bindings and discovering just how complete GNOME has become as a development platform. This kind of feedback is extremely rewarding and reflects well on everybody who is involved in GNOME development. Unfortunately, many of the conversations I had also highlighted the embarrassing gaps in developer education materials we have, but that is kind of a known problem already. We just really, really need to fix it.

Extremely tired now, so tomorrow will be Recovery Day before returning to work on Monday.


Day two of the pre-conference program (yesterday) flew past for me. I gave a couple of talks at the Python mini-conference. The first one went fairly well (I have given it a couple of times now -- talking about using Python in business situations), the second one slightly less well, but that was entirely my fault; I could not think of enough practical Python tricks to talk about and it kind of meandered towards the end. Still, it seemed to be fairly well received and nobody threw fruit or anything. Wound up with a slightly impromptu talk at the GNOME mini-conference and then being roped into the question-and-answer session at the end.

Completed my speaking commitments today with a GNOME tutorial which attracted a reasonable audience who seemed to be interested enough to ask questions throughout. This is the kind of tutorial I think GNOME contributors should be trying to give at every single conference, so it was an interesting experience to see if I could talk about GNOME at a sufficiently high level to be interesting to "third-party" developers (Gstreamer guys: I pushed somebody in your direction who is interested in writing developer-level documentation. Don't frighten him away!)

I had a bit more time to talk to people today (and last night at the speakers' dinner). Much of my enjoyment at conferences like l.c.a comes from just catching up with people I see once every year or two, so I am looking forward to just relaxing over the next three days, listening to talks and participating in the group discussions. I am completely blown away by the quality of organisation at this conference each year and the organisers this year have taken things to the next level again. Just little things like having areas set up under large tents so that we can sit out of the sun and talk make the whole experience very enjoyable.


I am at linux.conf.au this week, along with a few hundred other people (glynn, jdub, hypatia, jamesh, mrd, havoc, ... the list of familiar faces is endless).

Gave my first (of four) talks (paper here) at the Linux and OSS in Government mini-conference this afternoon. I managed to avoid embarrassing myself and some people stayed awake long enough to ask questions. One cannot ask for more.

The government mini-conference is really quite impressive. All day, it has been one speaker after another giving case study style presentations about the successful use of Open Source software and ideas at both the state and federal level. Mine was the only talk all day that was not about an existing government installation or project (I was doing an advocacy talk). Normally, I suspect this kind of stuff would drive me up the wall, but the presentations have been very interesting and the between-talks talk (during coffee breaks) very motivating and intelligent.
