Older blog entries for ade (starting at number 4)

Emergence and reputations:
A couple of weeks ago Shaun Smith asked me "what can I use trust metrics for?" I didn't really have an answer. I still don't.

In reading Emergence I've come to believe that the ideas behind trust metrics, reputation systems, reputation spaces (more about this one of these days), blog trees/rolls and FOAF networks are all intrinsically related. Out of that morass we ought to be able to derive systems that enable people to find clusters of like-minded or interesting individuals. In the same way that Google can find pages related to certain keywords, we should be able to use such a system to find related people.

The tricky bit is the implementation.

Random musings:
It's ironic that raph is asking bytesplit to leave. I had always considered the appearance of someone like bytesplit to be precisely the kind of 'attack' that Advogato's trust metric was designed to protect against. So why isn't it working?

Firstly, people aren't giving out certifications according to the guidelines; secondly, the existing trust metric doesn't offer a way to 'de-privilege' someone until they're invisible to you. The diary rating system is a good first step towards that goal.

danf: I had considered the idea of multi-dimensional networks but rejected them in favour of an approach which treats the results of various parallel (and potentially unrelated) trust networks/metrics as a set of co-ordinates in a reputation space. The spatial metaphor enables you to consider the effects of time on ratings and to plot an entity's path through the space to see if it's similar to others'. Reducing an entity's rating in n different networks to one value is tempting, but not as useful as a tuple of ratings with a canonical ordering: every network represents a dimension with an assigned number, and the tuples can be used to map the locations of other entities. A user can then define the regions of this reputation space that they value. For instance I might value the cluster of entities that have some rating in the Advogato dimension, a high rating in the Python dimension, a rating in the J2EE blog dimension and a greater-than-61% rating in the Java Certification dimension.
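A minimal sketch of the tuple idea, assuming a fixed canonical ordering of dimensions; the class name, the Euclidean distance choice and the per-dimension minima are all illustrative assumptions, not part of any existing system:

```java
// A point in "reputation space": one rating per network, stored in a
// fixed canonical order, e.g. [Advogato, Python, J2EE-blog, JavaCert].
class ReputationPoint {
    private final double[] ratings;

    ReputationPoint(double... ratings) {
        this.ratings = ratings.clone();
    }

    // Euclidean distance lets us ask "which entities are near this one?"
    double distanceTo(ReputationPoint other) {
        double sum = 0;
        for (int i = 0; i < ratings.length; i++) {
            double d = ratings[i] - other.ratings[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // A user-defined region of the space, expressed here as a minimum
    // rating per dimension (0 means "any rating at all is fine").
    boolean inRegion(double[] minima) {
        for (int i = 0; i < minima.length; i++) {
            if (ratings[i] < minima[i]) return false;
        }
        return true;
    }
}
```

The filter from the paragraph above would then be something like `inRegion(new double[]{0.0, 0.8, 0.0, 0.61})`: any Advogato rating, a high Python rating, any J2EE-blog rating, and over 61% on the Java Certification dimension.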

Yenta, written by one of Pattie "Firefly" Maes's graduate students, Lenny Foner, is a similar idea but with actual downloadable code rather than vapour. Funnily enough, I didn't actually find out about Yenta till I started searching for a decent link to Firefly to put this diary entry in context.

7 Oct 2002 (updated 8 Oct 2002 at 00:40 UTC) »

Got 4 out of 9 on the GPL quiz at http://www.gnu.org/cgi-bin/license-quiz.cgi. Just goes to show that the only kind of lawyer anyone would ever mistake me for is a language lawyer.

Does anyone know where it's possible to get the Mandrake 9 ISO images via BitTorrent?

Random musings: Supposing I'm a member of various groups all using trust metrics, how would I tie them together so that it's possible to easily express situations where A knows B, A rates B as Master on Tech-Advogato, but A rates B as Apprentice on Poetry-Advogato? Do we collapse the whole world into one big network with multiple arcs between any two nodes, or do we have lots of separate networks?
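The "one big network with multiple arcs" option could be sketched roughly like this; the class, the string-keyed networks and the certification levels are hypothetical names for illustration, not any real system's API:

```java
import java.util.HashMap;
import java.util.Map;

// One graph, many networks: each (rater, ratee) pair can carry a
// separate certification per named network, i.e. multiple labelled
// arcs between the same two nodes.
class MultiTrustGraph {
    // network name -> (rater -> (ratee -> certification level))
    private final Map<String, Map<String, Map<String, String>>> arcs =
            new HashMap<>();

    void certify(String network, String rater, String ratee, String level) {
        arcs.computeIfAbsent(network, n -> new HashMap<>())
            .computeIfAbsent(rater, r -> new HashMap<>())
            .put(ratee, level);
    }

    // Returns null when no arc exists in that network.
    String certification(String network, String rater, String ratee) {
        return arcs.getOrDefault(network, Map.of())
                   .getOrDefault(rater, Map.of())
                   .get(ratee);
    }
}
```

The situation from the question then falls out directly: A certifies B as Master on Tech-Advogato and Apprentice on Poetry-Advogato, and both arcs coexist between the same two nodes.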
Speaking of which I wonder what the intersection of the set of people that frequent the original wiki and have Advogato accounts looks like?

Ploughing through large amounts of the second edition of Friedl's Mastering Regular Expressions has led me to realise that ever-more sophisticated regular expressions won't solve the mark-up conversion issues I have with KwikWiki. It would be easy to blame the lack of features like nested regular expressions in the standard java.util.regex package, but I think the problem lies in the idea of a purely textual approach to processing markup. I need to come up with a different approach that respects the inherent (linguistic?) structure of wiki pages.
Of course the most troublesome area has to be the area with the fewest unit tests.
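One shape such a structure-respecting approach might take, sketched under assumptions: classify each line into a structural kind first, then convert each kind separately, rather than attacking the whole page with one regex. The class name and the line-kind rules here are invented for illustration, not KwikWiki's actual syntax:

```java
import java.util.ArrayList;
import java.util.List;

// First pass of a two-pass converter: recover the page's line-level
// structure before any markup conversion happens.
class WikiLineClassifier {
    static List<String> classify(String page) {
        List<String> kinds = new ArrayList<>();
        for (String line : page.split("\n")) {
            if (line.startsWith("*")) kinds.add("BULLET");
            else if (line.startsWith(" ")) kinds.add("PREFORMATTED");
            else if (line.isEmpty()) kinds.add("BLANK");
            else kinds.add("PARAGRAPH");
        }
        return kinds;
    }
}
```

The point is that a second pass can then apply inline regexes safely within each kind, instead of one pass trying to understand nesting textually.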

I've been aggressively refactoring the way special pages work in KwikWiki. The current code is rather embarrassing as it hard-codes the names of the special pages in a chain of if...elses. I've replaced it with a technique that I also used when writing the Emacs keybindings in Moleskine: a hashmap from strings to functions. Except that in Java you can't just have a dictionary literal containing anonymous functions; you have to go to all the trouble of defining an interface so that you can use anonymous inner classes which all implement it.
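The technique looks roughly like this; the interface name, page names and return values are made-up stand-ins, not KwikWiki's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// The interface exists only so the anonymous inner classes below have
// a common type -- it's the scaffolding a first-class function wouldn't need.
interface SpecialPage {
    String render();
}

class SpecialPages {
    private static final Map<String, SpecialPage> PAGES = new HashMap<>();
    static {
        // Each anonymous inner class stands in for one function,
        // replacing one branch of the old if...else chain.
        PAGES.put("RecentChanges", new SpecialPage() {
            public String render() { return "list of recently changed pages"; }
        });
        PAGES.put("AllPages", new SpecialPage() {
            public String render() { return "list of every page"; }
        });
    }

    static String render(String name) {
        SpecialPage page = PAGES.get(name);
        return page == null ? "no such special page" : page.render();
    }
}
```

Adding a new special page is now one `put` call instead of another `else if` branch.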

The annoying thing is that I've been claiming on Wiki lately that Java doesn't need blocks, and now I realise that blocks, or rather first-class functions, can simplify a large class of problems. For example, most implementations of the Command pattern would be much better off using a hashmap that contains functions rather than the scaffolding that OO languages require. Henry Baker made the same point a long time ago: a lot of design patterns are really nothing more than a codification of work-arounds for the deficiencies of a particular OO language. Object-functional languages remove the need for many of these patterns. This raises the question: do we really need any pure OO languages, or should they all be object-functional?

am:: The most embarrassing thing about having what you think are decent unit tests is when a bug report comes in that makes you realise you missed something pretty obvious. Anyone who has downloaded the 2002-07-04 build of KwikWiki will have noticed that it can't create new pages. Oops.

Mamading Ceesay sent me the bug report, complete with stacktrace. I followed the stacktrace and got to a line of code with a comment that says "assume file always exists." But files that haven't been created yet can't be assumed to exist.

It turns out that whilst I was fixing a bug in the reverse file index code (words weren't being removed from the index when they were removed from pages) I introduced this bug into the page creation code by making it try to open the pre-existing version of every page it tries to save. Mea culpa.

It's a fairly easy bug to fix but I'm trying to adhere fairly closely to XP principles in all my projects. So I now have to write a unit test that exposes it. Then and only then can I fix the bug.
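The test-first step might look something like this; the in-memory store, the class names and the fix itself are hypothetical stand-ins for KwikWiki's real file-backed code, shown only to illustrate the shape of the bug and the test that exposes it:

```java
import java.util.HashMap;
import java.util.Map;

// A toy in-memory page store standing in for the real file-backed one.
class PageStore {
    private final Map<String, String> pages = new HashMap<>();

    void save(String name, String content) {
        // The buggy version assumed the page's previous version always
        // existed (so it could update the reverse file index), which
        // blew up for brand-new pages. The sketch of the fix: treat a
        // missing page as an empty previous version.
        String previous = pages.getOrDefault(name, "");
        // ...the reverse file index would be updated from `previous` here...
        pages.put(name, content);
    }

    String load(String name) {
        return pages.get(name);
    }
}

// The test that would have caught the bug: save a page that has never
// existed before, then read it back.
class NewPageTest {
    static void testCreatingBrandNewPage() {
        PageStore store = new PageStore();
        store.save("FreshPage", "hello");
        if (!"hello".equals(store.load("FreshPage")))
            throw new AssertionError("brand-new pages should be saveable");
    }
}
```

Run against the buggy version, `testCreatingBrandNewPage` fails; once it passes, the fix is done and the regression can't sneak back in.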

In other news
I really need to get HttpUnit working as part of my build. The alternative to running tests in the app server would be to decouple more of my code from servlet technology so that it can be tested in isolation.
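The decoupling idea in miniature: keep the interesting wiki logic in a plain class with no javax.servlet imports, so a unit test can exercise it without an app server. The class name, the WikiWord regex and the link format are invented for illustration, not KwikWiki's actual API:

```java
// Plain logic class: because it never touches the servlet API, it can
// be unit-tested directly. A thin servlet would just call renderWord
// and write the result to the response.
class WikiLinker {
    // Turn a WikiWord into an HTML link; a word that names a page that
    // doesn't exist yet gets a trailing "?" edit link instead.
    static String renderWord(String word, boolean pageExists) {
        boolean wikiWord = word.matches("([A-Z][a-z]+){2,}");
        if (!wikiWord) return word;
        return pageExists
            ? "<a href=\"" + word + "\">" + word + "</a>"
            : word + "<a href=\"" + word + "?edit\">?</a>";
    }
}
```

With this split, HttpUnit only needs to cover the thin servlet layer; everything else runs as fast, in-process unit tests.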

Barring extreme laziness I shall be going to the XP Tuesday Club this evening. A whole bunch of people from Thoughtworks are coming over so it ought to be interesting.

pm (or rather middle of the night/early morning next day):: The trivial bug turned out to be more trouble than I'd anticipated. Luckily the test I'd written also failed with my quick fix in place, so I didn't have the embarrassment of announcing it was fixed and then apologising when I realised the problem was more subtle than I'd thought. In the end I wrote a proper solution on the tube whilst heading to the XP Tuesday Club.

I bumped into Martin Fowler and the rest of the Thoughtworks group on the way to XTC. They, and everyone else at XTC, were very friendly. Especially considering that this was my first visit.

It was very unstructured and laid back but I enjoyed it immensely. I shall be back.

There were lots of very interesting people there, including Duncan, the guy behind the Kew programming language. When I've had time to take a deeper look I shall probably post a link to Lambda.

And of course I put out a new release of KwikWiki.

10 Jul 2002 (updated 26 Nov 2009 at 15:09 UTC) »

Moving to Subversion from CVS is a bit like taking a taxi through Amsterdam when it's raining. You plough through puddles at 80kph and generally have a good time. However if other people are involved and you're actually trying to get some work done then things can get quite messy.

Luckily I'm only moving my personal projects over. On more than one occasion I've questioned my reasons for making the transition, especially this early in Subversion's lifetime. On the other hand it seemed like fun, and CVS's few flaws can be quite annoying: renaming a file without losing its history doesn't seem that important until you need it. After reading Martin Fowler's book, Refactoring, and applying the lessons I've learned from it, I've come to appreciate the value of developing code in a more disciplined way. This involves automated unit testing and constant refactoring (to eliminate duplication and reduce the build-up of cruft), amongst many other practices which you can find in the Extreme Programming books. But if one is constantly refactoring then one needs a source control system that can track an artefact even when its name changes or it moves from one directory to another.

Which is why I started learning Subversion in the first place.

I've bypassed the problem of disconnected operation by simply getting into the habit of moving my svn repository from laptop to desktop and back again whenever I'm on the road. It's not particularly elegant but it works. The main CVS feature I miss is the ability to browse the repository, or at least to get a listing of its contents.

And in other news, MySQL turns out to be closer to the SQL standard than MS Access. Porting a small Java web application from one to the other was relatively painless. In the end I didn't need the classes implementing the Factory pattern which were meant to hide the differences between the two. The only real problem was Access's weird syntax for the LIKE operator: it uses * and ? as wildcards where standard SQL uses % and _.
