Older blog entries for dwragg (starting at number 10)

Random Lispings

IoC containers, Part 2 soon. But in the meantime, some Lisp links.

Whatever happened to Henry Baker, I wonder.

IoC containers, Part 1

Mike Spille wrote an interesting essay about Inversion of Control containers (with a followup here). He gives a checklist of desirable features in these containers, and surveys HiveMind, Spring, and PicoContainer. PicoContainer comes out worst, partly because it lacks some of the features from his checklist (and in fact, these omissions are deliberate on the part of the creators of PicoContainer).

I have a particular interest in IoC containers, because I developed one at work, and continue to maintain it and enhance it today. I call it the init framework (colleagues often call it components.xml, because its per-module configuration files have that name). Here, I'll just call it TIF.

In terms of generalities, TIF has a lot in common with the well-known IoC containers. It shares many ideas and some specific techniques, and the basic benefits are similar. But these similarities represent a case of convergent evolution. I began work on TIF a little over 2 years ago (the CVS logs say the first check-in was on 2002-07-01; the first lines of code must have been written a few days earlier). If those other IoC containers existed at that time, I wasn't aware of them.

(From the public CVS repositories, it seems that HiveMind originates in May 2003, PicoContainer from mid-2003, and Spring from August 2003, but it is possible that those dates are misleading: the projects may have been hosted elsewhere before moving to their current repositories. In any case, some of these ideas may well have been around earlier; I seem to remember being aware of Apache Avalon, though it was certainly not something I had any desire to imitate.)

So I find it interesting to compare TIF with the open source ones, and to consider why I made the design choices I did, and why the authors of the other containers made similar or different choices. Since Mike's essay is perhaps the nearest thing I have seen to a survey of the IoC "market", it provides useful context for these comparisons.

In summary, here is how the comparison comes out:

  • Like HiveMind, TIF has a strong emphasis on XML-based configuration.

  • Like HiveMind and Spring, TIF has strong support for resolving dependencies both by name and by type.

  • Like PicoContainer, TIF supports only constructor-based injection, not setter-based injection (see the sketch after this list). However, there is a twist, which I believe achieves many of the benefits of setter-based injection without raising the difficult questions of that approach.

  • Like PicoContainer, TIF does not support cyclic dependencies. It did support them at one point, but that support was removed, and I have no inclination to put it back.
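To make the constructor-injection point concrete, here is a minimal sketch of the style (hypothetical classes, and not TIF's actual API): a component declares its dependencies as constructor parameters, and the container supplies them when it instantiates the component.

    // Hypothetical types standing in for real dependencies.
    interface Database { /* ... */ }
    interface Mailer { /* ... */ }

    // With constructor-based injection, a component's dependencies are
    // constructor parameters. The container resolves each parameter
    // (by name or by type) and invokes the constructor reflectively,
    // so the object is never observable in a half-wired state, and
    // its dependencies can be final.
    public class OrderService {
        private final Database database;
        private final Mailer mailer;

        public OrderService(Database database, Mailer mailer) {
            this.database = database;
            this.mailer = mailer;
        }
    }

Setter-based injection would instead expose setDatabase() and setMailer() methods and wire the object up after construction. That after-the-fact wiring is what makes cyclic dependencies expressible, and it is exactly the window of partial initialization that constructor injection closes.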

These last two points stand out, because Mike's article, and the subsequent discussion, criticized PicoContainer for missing precisely these features. Mike's position is that these are examples where the creators of PicoContainer put their purist principles ahead of the pragmatic needs of developers who might use their container. I can't speak for the creators of PicoContainer, but my decisions had very pragmatic reasons behind them: If my container doesn't meet the needs of the real-world projects I work on, my colleagues simply won't use it!

In subsequent posts, I will consider each of these features in turn, and explain how TIF works the way it does. Then I will write another post to describe notable features of TIF not shared by other containers.

(These posts are partly a response to colleagues who have asked me why I wrote TIF when there are similar open-source projects around, and why it doesn't support cyclic dependencies. I expect someone will eventually ask me why it doesn't support setter-based injection, and I will be able to point them here.)

NetKernel looks like an interesting project. But I don't think it does itself any favours by describing itself as a "microkernel", rather than as what it is: a REST-oriented Java framework for implementing services (as this TheServerSide.com article makes clear).

The term microkernel has a specific meaning, from operating systems research, and that meaning does not seem to be relevant to NetKernel in any way. Their use of the term kernel also seems dubious, but at least that is a general term used in different ways in different parts of the software world, so they can more reasonably appropriate it to apply to their project. (Maybe all this would seem less silly if microkernels, in the true sense of the word, were more widely seen as a good thing.)

I'd like to dismiss this as the usual marketing stupidity, but the misuse of the term continues even in the technical documentation, which claims to be free of such material:

We have deliberately omitted technical marketing documents which present business and/or technical analysis for choosing NetKernel. If you require technical marketing literature for NetKernel please visit the 1060research.com and 1060.org sites.

You also need to dig through their websites and documentation to discover that, while the concept may not be specific to Java, the current implementation certainly is. But surely this is important information for any potential customers? My guess is that they want to avoid the question of why they don't implement apparently-relevant standards from the Java world, such as the Servlets API. I hope that the answer is that they are doing innovative things that don't fit inside those standards. It would be nice to see this explained properly.

1 Aug 2004 (updated 1 Aug 2004 at 22:25 UTC) »

A couple of years ago, I read John Taylor Gatto's essay The Six-Lesson Schoolteacher (which happens to be on ncm's site). Now an entire book by the same author can be read online: The Underground History of American Education. Interesting reading.

Company develops HTTP proxy that supports persistent connections.

At least that is what it reads like. I haven't been able to find anything so silly on the actual NetScaler site, though there is not much technical detail there either.

23 May 2004 (updated 7 Jul 2004 at 17:26 UTC) »
An interesting essay about unit testing.

I rarely see much discussion of unit tests that acknowledges that they may not be appropriate in some contexts. But the cost/benefit function for unit tests is by no means trivial:

  • For many kinds of code, building worthwhile unit tests is extremely difficult. In my experience, the sophistication of the unit test code matches the sophistication of the code it tests. For code with a simple synchronous interface, whose outputs are a simple deterministic function of the inputs, unit testing will be easy (the sketch after this list shows such a case). When these conditions don't apply (that is, for code that is hard to get right), unit testing can become much more difficult.

    Even for "hard" code, unit testing is still possible, but the design must respect unit-testability. This can greatly increase the amount of work involved over the non-unit-tested case. If this level of difficulty is underestimated, the tests may end up ensuring very little, or being fragile, or creating an obstacle to maintainance and further development of the code covered by the tests.

  • In a large OO program, much of the behaviour arises from the interactions between objects. Unit testing strives to test individual classes, and so does not help to find problems in these interactions. At best, it can ensure that the objects implement their sides of the relevant contracts correctly. But that may not mean very much in a design where groups of objects achieve more than the sum of their parts.
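To make the contrast concrete, here is the kind of code for which unit testing is cheap (a hypothetical example, assuming JUnit 3.x): a synchronous method whose output is a deterministic function of its inputs, needing no fixtures, no mocks, and no cooperation from the surrounding design.

    import junit.framework.TestCase;

    public class IntervalTest extends TestCase {
        // Code under test: do the half-open intervals [a1,b1) and [a2,b2) overlap?
        static boolean overlaps(int a1, int b1, int a2, int b2) {
            return a1 < b2 && a2 < b1;
        }

        public void testOverlaps() {
            assertTrue(overlaps(0, 10, 5, 15));
            assertTrue(overlaps(5, 15, 0, 10));
            assertFalse(overlaps(0, 5, 5, 10)); // touching endpoints do not overlap
        }
    }

Asynchronous, stateful, or environment-dependent code offers no such free ride; that is where the cost curve bends upwards.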

Because of issues such as these, unit testing might best be understood as a way of catching some bugs earlier, rather than as a way to reduce overall defect rates. Thus it can make the development process more predictable.

But comprehensive unit testing has high costs, and other forms of testing may be more appropriate for some projects, or some parts of some projects. It would be nice if there were less cheerleading of unit testing, and more analysis of the pros and the cons.

23 Apr 2004 (updated 30 Nov 2004 at 19:31 UTC) »

I'm trying to work out if there is anything interesting in SOA. I'm hopeful that there is, but I haven't found anything written about SOA that captures it, despite the fact that a lot of clever people are writing about SOA. Of course, much of this verbiage is due to the SOA hype machine. But once you discard the hype, what is left?

If there is any merit to Web Services, it must be in SOA, since it is certainly not in the concrete realization of SOA as XML Web Services. Each layer is flawed in various ways. XML may be a nice document interchange format, but it is a strange choice as a message serialization format for communications protocols (even if you focus on document/literal style web services). SOAP, XML Schema and WSDL all have reasonable ideas at their cores, and can all be made to work, but none of them fit their niches perfectly, and none of them are a model of simplicity and elegance.

So where is the merit in SOA?

The most frequently cited statement of the essence of SOA seems to be Don Box's four tenets:

  1. Boundaries are explicit.
  2. Services are autonomous.
  3. Services share schema and contract, not class.
  4. Service compatibility is determined based on policy.

The first two tenets seem to me to be a matter of emphasis, rather than innovations. These principles can be used to advantage in conjunction with CORBA or COM; conversely, there is nothing about Web Services that requires these principles to be observed (though some Web Services toolkits make boundaries explicit simply by having an excessively clunky programming model).

The third tenet seems like the weakest of the four. CORBA and COM do not share programming language classes; they share contracts defined in their IDLs. The relevant implementations map contracts specified using those IDLs onto machine-centric bindings and representations, rather than the textual XML representations used by Web Services. But there is little in those IDLs that mandates any particular mapping. In any case, the debate does not seem to hinge on the superiority of XML as a data format compared to more machine-centric formats (perhaps because the XML generated by web-services toolkits is often little more human-readable than machine-centric representations). So my guess is that this tenet has its origin as a reaction to something else:

  • Java RMI, where the Java programming language also serves as the IDL, and objects can be passed across remote interfaces with few restrictions.

  • COM, which was often seen as simply a mechanism for inter-programming-language calls. This has left it somewhat tainted as a distributed programming technology, despite the existence of DCOM.

My conclusion is that if the first three tenets were all there is to SOA, then CORBA and DCOM would be perfectly adequate technologies for implementing SOA.

Which leaves "Service compatibility is determined based on policy". I have not yet come to any conclusion about how strong this tenet is. I suspect that there is something there, but the examples offered by WS-Policy and the related specifications are not very convincing.

14 Apr 2004 (updated 6 Jan 2005 at 12:27 UTC) »
Martin Fowler writes about published interfaces.

This struck a chord. In my day job, I maintain a large Java code base, which is used from several other projects. Mostly those other projects use the "published" interfaces, but sometimes they try to use classes and interfaces that I regard as internal (although they are declared as public, since they are internal but widely used within my code base). Eventually I will have to document the list of published classes and interfaces. I will start by making a list of classes that are already used externally. A while ago, I wrote a Perl script that analyzes dependencies between .class files, and it shouldn't be too hard to adapt it for this purpose.
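The same analysis is easy to sketch in Java (my actual script is in Perl; this hypothetical ClassDeps class is just an illustration of the idea): every class that a compiled class refers to directly appears as a CONSTANT_Class entry in its constant pool, so listing dependencies is a single linear scan.

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Iterator;
    import java.util.Set;
    import java.util.TreeSet;

    // Hypothetical sketch, not my Perl script: prints the classes
    // referenced by the .class file named on the command line, by
    // scanning its constant pool for CONSTANT_Class entries.
    // Handles the constant-pool tags defined as of Java 1.4.
    public class ClassDeps {
        public static void main(String[] args) throws IOException {
            DataInputStream in = new DataInputStream(new FileInputStream(args[0]));
            if (in.readInt() != 0xCAFEBABE)
                throw new IOException("not a class file");
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version

            int count = in.readUnsignedShort(); // entries occupy slots 1..count-1
            String[] utf8 = new String[count];
            int[] classNameIndex = new int[count];

            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                      // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break;  // Class
                case 8: in.skipBytes(2); break;                             // String
                case 3: case 4: in.skipBytes(4); break;                     // Integer, Float
                case 9: case 10: case 11: case 12: in.skipBytes(4); break;  // refs, NameAndType
                case 5: case 6: in.skipBytes(8); i++; break;                // Long, Double: two slots
                default: throw new IOException("unknown constant-pool tag " + tag);
                }
            }
            in.close();

            Set deps = new TreeSet();
            for (int i = 1; i < count; i++)
                if (classNameIndex[i] != 0)
                    deps.add(utf8[classNameIndex[i]].replace('/', '.'));
            for (Iterator it = deps.iterator(); it.hasNext(); )
                System.out.println(it.next());
        }
    }

A fuller tool would also parse field and method descriptors, since a type that appears only in a signature may never get a CONSTANT_Class entry of its own, and array classes show up in the "[Ljava/lang/String;" form, so a little normalization is needed.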

When I wrote the first entry to this diary, I planned to follow it up reasonably regularly. I didn't, but that is partly because I spent most of December on vacation. The new year seems like a good time to try again.

So to begin with, a few impressions gathered during that vacation.

  • For two weeks, I played at being a tourist in London and England, the city and country I lived in for most of my life, doing all the tourist highlights. England really is an expensive place to be a tourist, but overall it seemed decent value for money.

  • I drove a lot. I hadn't driven for almost 2 years, and wondered how I would find it. It was fun. The roads were clear, the weather was good, and I was driving through some pleasant places, which probably had a lot to do with it being fun.

  • I spent Christmas with my parents in London. I wasn't there last year, and missed it greatly. There is something strangely magical about watching TV with a full stomach and a decorated piece of tree in the room.

  • I returned to Moscow, where I now live, shortly before the new year. Three weeks away was enough to restore the sense of foreignness I felt when I first came here, though the feeling was short-lived.

slamb has a point.

In my previous article, I was referring to debuggers as tools for single-stepping a running program, inspecting data, setting breakpoints and watchpoints, etc. When programming in C and C++, I did use debuggers as slamb describes: for obtaining stack traces upon segfaults. But in the last couple of years, the majority of my development has been in Java, where exception stack traces make use of a debugger for this purpose unnecessary. And before that, when I was mostly programming in C, I implemented a simple stack trace facility, a bit like the glibc libSegFault or the Linux kernel Oops reports, and used it in a few projects (multithreaded programs, since at that time on Linux, threads and gdb did not mix well, and multithreaded core files did not work at all).

So I was referring to debuggers in a particular sense, but in my defence, I think the same sense was used in the articles which I linked to.

