Now that we have a gorgeous cover and a press release it all feels very real.
The book's called Software Craftsmanship: From Apprentice to Journeyman, and it's going to be in the same series as books like Beautiful Code, Prefactoring and Practical Development Environments.
In the past couple of months I've been to 3 events that exemplified their communities. I'll describe them in decreasing order of fun.
BarcampLondon3 took place in the Google London offices. It was dynamic, self-organising and full of strangers. Everywhere I looked people were introducing themselves to each other. The sessions tended to be less than polished. In fact in any given session there would be a person writing their presentation for the next session. Despite, or perhaps because of, this the sessions tended to be filled with information that someone passionately wanted to share.
The people tended to be younger and more diverse (in every dimension) than the Agile crowd. For example Ian Forrester's presence meant that I wasn't "the black guy." Practitioners like Matt Biddulph showed up and talked about the technologies being used by their startups. Developers from the BBC demonstrated unreleased systems and everywhere I looked I saw people having conversations about advancing the state of their art. It was invigorating.
XPDay 2007 was full of workshops, consultants and old colleagues. A lot of the sessions seemed to involve a presenter with the germ of an idea who expected the audience to flesh it out. These 'stone soup' sessions can be a lot of fun but after a while they become monotonous. You start to long for a presenter with an agenda, an idea to share or even an axe to grind.
David Stoughton's talk was the only one that delivered. David (and his co-presenter whose name I've sadly forgotten) presented difficult ideas that generated debate and principled disagreement amongst the audience. I'm not sure I agreed with all the ideas they raised about brands, projects and value but they challenged my existing beliefs. That in itself is valuable.
The Agile community, to its detriment, has focussed on helping laggards improve rather than advancing the state of its art. This has resulted in boredom for experienced agile practitioners. Apart from networking, meeting potential clients and catching up with old friends in the pub there's precious little to do or learn at these conferences. It's just the same old people teaching the same old lessons.
I think this happened because there was a dearth of people from other communities at the conference. Glazier's Hall was an island populated by consultants and people who work in investment banks. As someone who used to be both I should have fit right in. However I spent large parts of the conference feeling disconnected from everybody else's preoccupation with enterprise IT. It seemed hard for people to adjust to the idea that there's a whole world of people building systems that aren't foisted upon a captive user base.
There was a presentation from a team at Yahoo Europe but I felt that they didn't focus enough on the lessons that other people could take from their experience of adopting Scrum. The key insight I took away from the first day of the conference was from Jeff Patton's talk. He enabled me to see the connection between the roles of the agile Customer/Product Owner and that of a Product Manager by showing that Product Managers are really just 'professional' Customers.
Apart from that the conference only reinforced my sense that Agile seems to be approaching senescence. We've forgotten our origins in lightweight methodologies focussed on finding ways to perform our jobs better. We've turned inwards and are re-iterating the same old ideas when we should be looking to see if other people have harder problems, newer challenges and different lessons to teach us.
I'd like next year's XPDay to reach out to a wider group of people and be more open to new ideas. This isn't just a matter of embracing the technologies that other conferences are using. We have to go further and try risky things like supporting an explicit 'hallway track' by having shorter sessions with staggered start times so that people have more time to mingle between sessions. I'd also like to see a BarCamp style track where random attendees could run sessions about whatever caught their fancy. This would finally give us a conference that could respond to feedback more frequently than once a year. I would also like to see XPDay inviting more 'strangers'. People like Dave Snowden, David Stoughton or even Tim O'Reilly could bring an infusion of ideas from outside the enterprise IT/consulting/academic circles that would really enliven this conference.
A little while later I went to the Royal Institution for a talk on Machine Learning by Professor Chris Bishop of Microsoft Research. In its heyday the RI used to host talks by Faraday and the leading scientists of the 19th century. Unfortunately the present administrators have confused the accidental and essential aspects of the Faraday era. So they've replicated all the features of Faraday's lectures right down to the servants holding the doors open for the speaker, the gong that indicates that the allotted hour of lecture time is finished and of course the "no questions" rule.
Consequently the middle-aged crowd of bourgeoisie spent 60 minutes being taught about Bayes' theorem, where Bishop was brilliant, and being bewildered by demos of fancy image editing features that Microsoft Research are working on. Since we couldn't ask questions we had to sit there in our black-tie outfits and hope that the speaker would pick up on our bafflement in time to tie all the threads of his presentation together. He didn't.
I left the session wondering why anyone would attend this kind of presentation in person if there's no interaction with the presenter. If I'm going to sit numbly while someone else pontificates then I might as well be at home watching TV. Sadly the RI doesn't really seem to solicit feedback so there's not much I can do about it except be disappointed.
The video from his Tech Talk on Lessons Learned From Advogato is now available. It's a candid look at the issues that affect any community that tries to use trust metrics to facilitate large scale collaboration.
Aggrevator is finally available (in beta). According to my LaptopWiki I started work on this in July 2003 because I wanted a web-application based aggregator to replace Flock. Since then I've played around with implementing the essential idea in technologies ranging from Python to Java. And I finally settled on a rich client application written in Java. It uses SWT (the same library as Eclipse) for the GUI and MySql for the database back-end.
There really ought to be a name for the process of searching for a license that is compatible with the licenses of all the libraries that you're using. I ended up going with GPL mainly because the MySql JDBC drivers use that license and all my other libraries are either LGPL, BSD or Apache.
It's good to see that Advogato is back up. Although I'd stopped checking to see if it was still up and I only found out because Google sent me a web-alert after re-indexing my diary here.
So what have I been up to since the last time? Well I switched employers, met mwh at this year's UK Python conference and I'm doing a lot less travelling.
In relation to things open source: I never did finish that OsCache guide. On the bright side I've made significant progress on Harvester: an RSS aggregator. Of course everybody and their dog has an aggregator nowadays but mine is different because it's based on storing everything (for offline reading, later searching, analysis, etc) in a MySql database and using scoring to order feeds.
Like everybody else I tried using an approach based on Bayes' Theorem but swiftly ran into a problem: to rank all entries by their classification (how interesting are they based on your previously expressed preferences?) you need to classify every entry. What's more, every time the user expresses an interest you need to re-classify every entry. Unfortunately I'm testing my aggregator with about 853 feeds containing 55,570 entries from the last 2 months. Because I'm showing the relationship between all these entries rather than making a binary spam/ham distinction, the need to re-classify pretty much rules out anything similar to Classifier4J. Pity really as it's a nice little library.
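To make that cost concrete, here's a minimal sketch of the kind of naive Bayes scoring described above. It is not the aggregator's code: the class and function names, the two labels, the whitespace tokeniser and the add-one smoothing are all assumptions picked for illustration. The point is only that ranking needs a score for every stored entry, and that each new preference changes the counts those scores depend on.

```python
import math
from collections import Counter


def tokens(text):
    """Very crude tokeniser (an assumption): lower-case words split on whitespace."""
    return text.lower().split()


class InterestModel:
    """Naive Bayes over two hypothetical labels: 'interesting' and 'boring'."""

    def __init__(self):
        self.word_counts = {"interesting": Counter(), "boring": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        # Every expressed preference changes the counts used by score().
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokens(text))

    def _log_likelihood(self, text, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.word_counts["interesting"]) | set(self.word_counts["boring"]))
        # Add-one smoothing, with one extra slot so unseen words never give log(0).
        log_prior = math.log((self.doc_counts[label] + 1) / (sum(self.doc_counts.values()) + 2))
        return log_prior + sum(
            math.log((counts[word] + 1) / (total + vocab + 1)) for word in tokens(text)
        )

    def score(self, text):
        """Log-odds that the text is interesting; higher means more interesting."""
        return self._log_likelihood(text, "interesting") - self._log_likelihood(text, "boring")


def rank(model, entries):
    # One score() call per entry, and all of them are stale after the next train().
    return sorted(entries, key=model.score, reverse=True)
```

With tens of thousands of entries, calling rank() after every single train() is exactly the re-classification cost that makes this approach painful.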
In other news I've rejoined the church of emacs. Even if it's only for looking at logfiles.
Still struggling to write that guide to OsCache. Hopefully I won't still be saying the same thing this time next year.
On the bright side I managed to write up a nice little visualisation script in Ruby. It basically takes your struts-config.xml file, turns it into a graph and spits out a .png image of that graph. It's available here and an example of the output is available here. I wrote it mostly to help me learn Ruby but now that it's up on my site I've realised that I always seem to end up playing with visualisation tools of one kind or another. I think I'll get myself one of Edward Tufte's books for Christmas.
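For anyone curious what the idea looks like, here's a rough sketch in Python rather than Ruby. It is not the script mentioned above: the function name is made up, and it assumes the usual Struts 1 layout where each action element has a path attribute and contains forward elements with name and path attributes. Instead of drawing anything itself it just emits Graphviz DOT text.

```python
import xml.etree.ElementTree as ET


def struts_config_to_dot(xml_text):
    """Return Graphviz DOT text with one edge per <forward> inside each <action>."""
    root = ET.fromstring(xml_text)
    lines = ["digraph struts {"]
    for action in root.iter("action"):
        source = action.get("path")
        for forward in action.findall("forward"):
            # Label each edge with the forward's name, e.g. "success".
            lines.append(
                '  "%s" -> "%s" [label="%s"];'
                % (source, forward.get("path"), forward.get("name"))
            )
    lines.append("}")
    return "\n".join(lines)
```

Saving the result to a file such as graph.dot and running "dot -Tpng graph.dot -o graph.png" should then produce the image, assuming Graphviz is installed.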
Apropos of nothing I'm reminded of a C++/Program design exam at university where I found a bug in one of the questions. I'm still amused by the note I wrote. It was something along the lines of: if this did what you think it does the correct answer would be foo but due to this bug it actually does bar. I'll probably never know if I got any marks for that question. C'est la vie.
TDD (Test Driven Development) has a lot of benefits (it really helped me in writing strutsviz despite my almost non-existent Ruby knowledge) but it's blooming hard. Mainly because it's an indirect design process. I've tended to find that the first version of anything I write with TDD passes the tests but feels awkward, inelegant or contains duplication. At which point I usually re-realise that TDD requires iteration and refactoring as well. Of course the tests mean that you can transform the code incrementally without worrying too much about breaking things.
And sometimes on a good day all this comes together nicely. The code approaches ExtremeNormalForm and I find that the addition of extra functionality is actually shrinking the number of lines of code.
vivekv: I think I have a clearer idea what's going on with AdvogatoPoster. Firstly we have to accept that anything to do with Date, Calendar or TimeZone in the standard Java libraries will be a mess. Then everything makes sense.
The essential problem seems to be: how do we compare dates that originated in two different timezones? Unix time itself is the same everywhere, but we can't trust the raw value here because the Date was built by reading Advogato's clock fields as if they were in the local timezone, so it is off by the difference between the two zones. Therefore we pick a timezone and do all comparisons in there. Since Advogato is based in Berkeley (or at least that's what traceroute says) we just do all date comparisons in Berkeley time.
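import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Reinterprets the local wall-clock fields of 'date' as Berkeley (Pacific) time.
private static long convertToBerkeleyTime(Date date) {
    // Read the date's fields in the JVM's default timezone.
    Calendar local = new GregorianCalendar();
    local.setTime(date);
    // Copy those fields into a calendar pinned to Advogato's timezone.
    Calendar la = new GregorianCalendar(TimeZone.getTimeZone("America/Los_Angeles"));
    la.clear();
    la.set(Calendar.YEAR, local.get(Calendar.YEAR));
    la.set(Calendar.MONTH, local.get(Calendar.MONTH));
    la.set(Calendar.DAY_OF_MONTH, local.get(Calendar.DAY_OF_MONTH));
    la.set(Calendar.HOUR_OF_DAY, local.get(Calendar.HOUR_OF_DAY));
    la.set(Calendar.MINUTE, local.get(Calendar.MINUTE));
    la.set(Calendar.SECOND, local.get(Calendar.SECOND));
    la.set(Calendar.MILLISECOND, local.get(Calendar.MILLISECOND));
    return la.getTimeInMillis();
}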
The above code gives us a long which can be meaningfully compared to the timestamp on a local file.
13 May 2003 (updated 18 May 2003 at 16:33 UTC)
vivekv: I had a look at the code to AdvogatoPoster and the XML-RPC client you're using is using the DateFormat's parse method to create your Date object. This assumes that a date created locally should participate in the local timezone.
Your code will have to call setTime(Date date) on a Calendar instance and then make sure that Calendar is pointing to whatever timezone Advogato is in. Then you can use the time fields in that Calendar object to do your file processing. Details are here.
It says something about the current drift towards complexity in the Python language (cf. proposals for a ternary operator and do-while loops) that I almost believed that this was a real PEP. I can only hope that instead we'll see more changes like the new csv module or type unification which make life simpler.
Even though I still prefer Python (mostly due to the maturity, quality and size of the libraries) I think most Python developers would benefit from taking a look at Ruby and identifying ideas that are worth borrowing.
On another note, I've been adding caching to KwikWiki using OsCache (http://www.opensymphony.com/oscache/). It works very well but learning to use it requires reading its source code, scanning Google and diving through the project's mailing list archives. I hope to have a guide to using OsCache in Model 2 architectures with servlet filters up on my web site soon so that others won't have to learn the hard way.
One thing that I've begun to be interested in again is identifying a canon of programs that are worth reading for their educational value. As part of my day job I'm in the habit of using KwikWiki to introduce people to certain ideas about object orientation and design patterns but I wish I had more examples to point people to. For now this wiki page will have to suffice.
Whilst waiting for my plane this evening I realised that Advogato already possesses sufficient data to be used as a reputation space. The certifications and the diary ratings together comprise a 2-d reputation space which can be charted.
The basic algorithm looks something like this:
# assume we have a 2-d array of points
for each person in personList:
    r = person.rating
    c = person.cert
    points[r][c] = person
Once you have that then finding out who is in your reputation neighbourhood/cluster is trivial. Of course all this proves is that you can massage multiple independent reputation systems together to create a 'space'. But with other inputs it does show how this could be used for finding like-minded people or community-formation or collaborative filtering. The real problems start to show up when you're trying to visualise n-dimensional reputation spaces and provide a usable interface.
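As a rough illustration of why the neighbourhood lookup is trivial, here's a minimal sketch. It is not the advospace code: the Person record, the function names and the use of Euclidean distance are all assumptions, and it presumes that both the diary rating and the certification level have already been turned into numbers.

```python
import math
from collections import namedtuple

# Hypothetical record: 'rating' is a diary rating, 'cert' a numeric certification level.
Person = namedtuple("Person", ["name", "rating", "cert"])


def build_space(people):
    """Map each (rating, cert) point to the names located there."""
    space = {}
    for person in people:
        space.setdefault((person.rating, person.cert), []).append(person.name)
    return space


def neighbours(people, root, radius):
    """Names of everyone within 'radius' (Euclidean distance) of root, nearest first."""
    found = []
    for person in people:
        if person.name == root.name:
            continue
        distance = math.hypot(person.rating - root.rating, person.cert - root.cert)
        if distance <= radius:
            found.append((distance, person.name))
    return [name for _, name in sorted(found)]
```

A dictionary keyed by point is used instead of a real 2-d array so that several people can share a point and ratings don't have to be integers. Adding a third reputation system would only mean a longer tuple and one more term in the distance, which is where the visualisation problem starts.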
A basic implementation in Python can be found here: http://www.oshineye.com/software/advospace.html and an example of its output using myself as the root for the diary ratings can be found here: http://www.oshineye.com/software/advoSpaceOutput.html
Searching for Lenny Foner and his (seemingly dead) Yenta project on CiteSeer has generated a large stack of pdfs that I shall be going through sooner or later. Most of the interesting ones are using agents or similar notions to avoid centralising all the information for the trust metric in one place. That also gets around a few of the scaling issues.