23 Sep 2003 Bram   » (Master)

Quasi-review: Test Driven Development

The introduction to Test Driven Development by Kent Beck starts with the following:

Early one Friday, the boss came to Ward Cunningham to introduce him to Peter, a prospective customer for WyCash, the bond portfolio management system the company was selling. Peter said, "I'm very impressed with the functionality I see. However, I notice you only handle U.S. dollar denominated bonds. I'm starting a new bond fund, and my strategy requires that I handle bonds in different currencies." The boss turned to Ward, "Well, can we do it?"

Anyone interested in domain knowledge would at this point have done some research on exchange rate volatility and hedging, critically important subjects for anyone competently managing a bond portfolio which dabbles in multiple denominations. But extreme programming dictates that you jump right in and start coding. Sure enough, half the book is dedicated to the writing of a Value class which completely obscures the fact that exchange rates vary over time and that a commission has to be paid every time a currency exchange is done.
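To make the complaint concrete, here is a minimal sketch of a conversion that keeps both facts visible: the rate depends on the date, and every exchange costs a commission. The names (`Money`, `RateTable`, `convert`) and the flat 1% commission are assumptions made up for illustration; they are not taken from the book or from WyCash.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str


class RateTable:
    """Hypothetical table of dated exchange rates, keyed by (day, source, target)."""

    def __init__(self, commission=Decimal("0.01")):
        # Assumed: a flat fraction of the proceeds is charged on every exchange.
        self.commission = commission
        self._rates = {}

    def set_rate(self, day, source, target, rate):
        self._rates[(day, source, target)] = Decimal(rate)

    def convert(self, money, target, day):
        if money.currency == target:
            return money  # no exchange happens, so no commission is due
        # The rate is looked up for a specific day; a missing rate raises KeyError.
        rate = self._rates[(day, money.currency, target)]
        gross = money.amount * rate
        return Money(gross * (1 - self.commission), target)


rates = RateTable()
rates.set_rate(date(2003, 9, 22), "CHF", "USD", "0.5")
rates.set_rate(date(2003, 9, 23), "CHF", "USD", "0.6")
francs = Money(Decimal("100"), "CHF")
monday = rates.convert(francs, "USD", date(2003, 9, 22))
tuesday = rates.convert(francs, "USD", date(2003, 9, 23))
```

The same 100 francs is worth a different number of dollars on each day, and converting there and back loses money to the commission, which is exactly the information a date-free, fee-free conversion hides.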

Trust Metrics

My claim in my last diary entry that the algorithm will always find all violating subsets isn't true. For example, it's possible that there are two disjoint islands of nodes, the smaller of which has proportionately slightly fewer spam markings and hence drifts slowly upwards. If there's a set of spammers within that island which drifts downwards relative to it more slowly than the island as a whole drifts upwards, they won't be detected. This isn't a big problem in practice, since it only lets a few extra spams through. Any group which goes significantly over the limit will quickly get detected.

The last metric I gave suffers from the following bad artifact: If A certs B, and B isn't certed by anyone else and B spams, then A gets knocked out completely along with B. This can be fixed by differentiating between removed nodes which are directly certed by a non-removed node and removed nodes which are only certed indirectly, by other removed nodes. Indirectly certed nodes are removed, while directly certed nodes are only removed if they themselves spam more than the threshold. This makes it so that if you don't spam yourself but cert a bunch of dubious people, the worst that could happen is that your certs stop working. It also makes it possible for spammers to send twice as much spam, first indirectly and then directly, but that's no big deal, since we have several orders of magnitude to play with before spamming becomes worthwhile.
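The fix can be sketched as a post-processing step over whatever set of nodes the metric has already knocked out. This is only an illustration under assumptions: the metric itself is not reproduced here, the function name `resolve_removed` and the data layout (a dict from certifier to the set of nodes it certs, a dict of per-node spam counts) are invented, and the classification is done in a single pass against the original removed set.

```python
def resolve_removed(certs, removed, spam_counts, threshold):
    """Return the subset of `removed` that stays removed after the fix.

    certs:       dict mapping a certifier to the set of nodes it certs (assumed layout)
    removed:     set of nodes the metric knocked out
    spam_counts: dict mapping a node to the number of spams attributed to it
    threshold:   per-node spam limit (hypothetical parameter)
    """
    # A removed node is "directly certed" if some non-removed node certs it.
    directly_certed = {
        node
        for certifier, targets in certs.items()
        if certifier not in removed
        for node in targets
        if node in removed
    }

    still_removed = set()
    for node in removed:
        if node not in directly_certed:
            # Only certed by other removed nodes: stays removed.
            still_removed.add(node)
        elif spam_counts.get(node, 0) > threshold:
            # Directly certed, but spams more than the threshold itself.
            still_removed.add(node)
    return still_removed


# The artifact from the text: root certs A, A certs B, and B spams.
certs = {"root": {"A"}, "A": {"B"}}
result = resolve_removed(certs, {"A", "B"}, {"B": 50}, threshold=10)
```

Here `result` contains only B. A is reinstated because it is directly certed by a non-removed node and sent no spam itself, while its cert of B does nothing for B.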

The technique I gave allows a certain amount of spam per cert. Changing it to an amount per node rather than per cert may have significant advantages, both in terms of behavior and computation time, but I have to think about it more.


Matt Brubeck informs me that the GUI API I described is essentially what BeOS did.

So BeOS had a good GUI system in addition to real-time features. I don't know what criteria Apple used when it decided to purchase NeXT instead of Be, but technical merit clearly was not the overriding one.


The automatic insertion of <p> tags on Advogato is too smart for its own good. The only coherent way to do it is to add <br> before every carriage return (assuming one isn't already there) and skip <p> tags completely.
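The rule above is simple enough to sketch in a few lines. This is an illustration only, not Advogato's actual code: the function name `add_line_breaks` is made up, "carriage return" is read as any line ending, and an existing <br> is only recognized when it sits immediately before the line ending.

```python
import re


def add_line_breaks(text):
    """Put <br> before every line ending that doesn't already have one; never emit <p>."""
    # Normalize Windows and old Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # The negative lookbehind skips newlines already preceded by <br> (in any case).
    return re.sub(r"(?<!<br>)\n", "<br>\n", text, flags=re.IGNORECASE)


sample = "first line\nsecond line<br>\n\nnew paragraph"
rendered = add_line_breaks(sample)
```

A blank line in the input simply becomes two consecutive breaks, so paragraph spacing falls out of the same rule without any guessing about where a <p> should open or close.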

Of course, Advogato was just trying to do the 'right thing'. Supposedly the semantics of <p> are deeply meaningful and important. Some of us can even remember back in the day when anyone who put <br><br><br> in their html would get flamed for it. These days everyone can see that such flamage was stupid. The truth is that as soon as Mosaic (the first good GUI web browser) came out it was completely obvious that html was all about layout. <p>'s semantics have been nothing short of a disaster. The many differences between browsers caused by implicit insertions of <p> in different places have caused no end of headaches.

The w3c, for its part, continues to claim that html is all about markup, and denies that making <p> anything other than shorthand for <br><br> is asinine.
