Older blog entries for dwragg (starting at number 5)

23 May 2004 (updated 7 Jul 2004 at 17:26 UTC) »
An interesting essay about unit testing.

I rarely see much discussion of unit tests that acknowledges that they may not be appropriate in some contexts. But the cost/benefit function for unit tests is by no means trivial:

  • For many kinds of code, building worthwhile unit tests is extremely difficult. In my experience, the sophistication of the unit test code matches the sophistication of the code it tests. For code with a simple synchronous interface, and where the outputs are a simple deterministic function of the inputs, unit testing will be easy. When these conditions don't apply (that is, for code that is hard to get right), unit testing can become much more difficult.

    Even for "hard" code, unit testing is still possible, but the design must respect unit-testability. This can greatly increase the amount of work involved over the non-unit-tested case. If this level of difficulty is underestimated, the tests may end up ensuring very little, or being fragile, or creating an obstacle to maintenance and further development of the code covered by the tests.

  • In a large OO program, much of the behaviour arises from the interactions between objects. Unit testing strives to test individual classes, and so does not help to find problems in these interactions. At best, it can ensure that the objects implement their sides of the relevant contracts correctly. But that may not mean very much in a design where groups of objects achieve more than the sum of their parts.
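The point about interactions can be illustrated with a small sketch. The classes below are hypothetical (they are not taken from any real code base) and are written in Python only for brevity. Each one passes its own unit test, yet the two disagree about units, so the composed program gives the wrong answer.

```python
class Thermometer:
    """Hypothetical sensor wrapper; its contract says read() returns Celsius."""

    def __init__(self, raw_celsius):
        self._raw = raw_celsius

    def read(self):
        return self._raw


class FrostAlarm:
    """Hypothetical consumer; its author assumed readings are in Fahrenheit."""

    FREEZING_F = 32.0

    def __init__(self, sensor):
        self._sensor = sensor

    def is_freezing(self):
        return self._sensor.read() <= self.FREEZING_F


class StubSensor:
    """Test double for FrostAlarm's unit test; it returns Fahrenheit values."""

    def __init__(self, value_f):
        self._value_f = value_f

    def read(self):
        return self._value_f


def test_thermometer():
    # Passes: Thermometer honours its own (Celsius) contract.
    assert Thermometer(21.5).read() == 21.5


def test_frost_alarm():
    # Passes: FrostAlarm is correct against a stub that shares its assumption.
    assert FrostAlarm(StubSensor(20.0)).is_freezing()
    assert not FrostAlarm(StubSensor(50.0)).is_freezing()


def alarm_on_mild_day():
    # The real objects wired together: 20 degrees Celsius is a mild day,
    # but the alarm interprets the number as 20 degrees Fahrenheit.
    return FrostAlarm(Thermometer(20.0)).is_freezing()
```

Both unit tests succeed, while `alarm_on_mild_day()` reports frost on a mild day. Only a test that exercises the two real objects together would catch the mismatch.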

Because of issues such as these, unit testing might best be understood as a way of catching some bugs earlier, rather than as a way to reduce overall defect rates. Thus it can make the development process more predictable.

But comprehensive unit testing has high costs, and other forms of testing may be more appropriate for some projects, or some parts of some projects. It would be nice if there were less cheerleading of unit testing, and more analysis of the pros and the cons.

23 Apr 2004 (updated 30 Nov 2004 at 19:31 UTC) »

I'm trying to work out if there is anything interesting in SOA. I'm hopeful that there is, but I haven't found anything written about SOA that captures it, despite the fact that a lot of clever people are writing about SOA. Of course, much of this verbiage is due to the SOA hype machine. But once you discard the hype, what is left?

If there is any merit to Web Services, it must be in SOA, since it is certainly not in the concrete realization of SOA as XML Web Services. Each layer is flawed in various ways. XML may be a nice document interchange format, but it is a strange choice as a message serialization format for communications protocols (even if you focus on document/literal style web services). SOAP, XML Schema and WSDL all have reasonable ideas at their cores, and can all be made to work, but none of them fit their niches perfectly, and none of them are a model of simplicity and elegance.

So where is the merit in SOA?

The most frequently cited statement of the essence of SOA seems to be Don Box's four tenets:

  1. Boundaries are explicit.
  2. Services are autonomous.
  3. Services share schema and contract, not class.
  4. Service compatibility is determined based on policy.

The first two tenets seem to me to be a matter of emphasis, rather than innovations. These principles can be used to advantage in conjunction with CORBA or COM; conversely, there is nothing about Web Services that requires these principles to be observed (though some Web Services toolkits make boundaries explicit simply by having an excessively clunky programming model).

The third tenet seems like the weakest of the four. CORBA and COM do not share programming language classes; they share contracts defined in their IDLs. The relevant implementations map contracts specified using those IDLs onto machine-centric bindings and representations, rather than the textual XML representations used by Web Services. But there is little in those IDLs that mandates such a particular mapping. In any case, the debate does not seem to hinge on the superiority of XML as a data format compared to more machine-centric formats (perhaps because the XML generated by web-services toolkits is often little more human-readable than machine-centric representations). So my guess is that this tenet has its origin as a reaction to something else:

  • Java RMI, where the Java programming language also serves as the IDL, and objects can be passed across remote interfaces with few restrictions.

  • COM, which was often seen as simply a mechanism for inter-programming-language calls. This has left it somewhat tainted as a distributed programming technology, despite the existence of DCOM.

My conclusion is that if the first three tenets were all there is to SOA, then CORBA and DCOM would be perfectly adequate technologies for implementing SOA.

Which leaves "Service compatibility is determined based on policy". I have not yet come to any conclusion about how strong this tenet is. I suspect that there is something there, but the examples offered by WS-Policy and the related specifications are not very convincing.

14 Apr 2004 (updated 6 Jan 2005 at 12:27 UTC) »
Martin Fowler writes about published interfaces.

This struck a chord. In my day job, I maintain a large Java code base, which is used from several other projects. Mostly those other projects use the "published" interfaces, but sometimes they try to use classes and interfaces that I regard as internal (although they are declared as public, since they are internal but widely used within my code base). Eventually I will have to document the list of published classes and interfaces. I will start by making a list of classes that are already used externally. A while ago, I wrote a Perl script that analyzes dependencies between .class files, and it shouldn't be too hard to adapt it for this purpose.
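As a rough sketch of that plan (in Python rather than Perl, and not the actual script): assume the dependency analysis has already produced a map from each class name to the set of classes it references. The package prefix `com.example.core.` and every class name below are made up for illustration.

```python
def externally_used(deps, internal_prefix):
    """Return the internal classes referenced by at least one external class.

    deps maps a fully qualified class name to the set of class names it
    references (assumed to come from an earlier analysis of .class files).
    """
    used = set()
    for cls, refs in deps.items():
        if cls.startswith(internal_prefix):
            continue  # references from inside the code base do not count
        used.update(r for r in refs if r.startswith(internal_prefix))
    return sorted(used)


# Hypothetical dependency data.
deps = {
    "com.example.core.Engine": {"com.example.core.impl.Cache"},
    "com.example.core.impl.Cache": set(),
    "com.other.app.Main": {"com.example.core.Engine", "java.lang.String"},
    "com.other.tool.Hack": {"com.example.core.impl.Cache"},
}

candidates = externally_used(deps, "com.example.core.")
```

The resulting list is only a starting point: a class such as the hypothetical `impl.Cache` shows up because some other project depends on it, not because it was ever meant to be published.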

When I wrote the first entry to this diary, I planned to follow it up reasonably regularly. I didn't, but that is partly because I spent most of December on vacation. The new year seems like a good time to try again.

So to begin with, a few impressions gathered during that vacation.

  • For two weeks, I played at being a tourist in London and England, the city and country I lived in for most of my life, doing all the tourist highlights. England really is an expensive place to be a tourist, but overall seemed decent value for money.

  • I drove a lot. I hadn't driven for almost 2 years, and wondered how I would find it. It was fun. The roads were clear, the weather was good, and I was driving through some pleasant places, which probably had a lot to do with it being fun.

  • I spent Christmas with my parents in London. I wasn't there last year, and missed it greatly. There is something strangely magical about watching TV with a full stomach and a decorated piece of tree in the room.

  • I returned to Moscow, where I now live, shortly before the new year. Three weeks away was enough to restore the sense of foreignness I felt when I first came here, though the feeling was short-lived.

slamb has a point.

In my previous article, I was referring to debuggers as tools for single stepping a running program, inspecting data, setting breakpoints and watchpoints, etc. When programming in C and C++, I did use debuggers as slamb describes: for obtaining stack traces upon segfaults. But in the last couple of years, the majority of my development has been in Java, where exception stack traces make the use of a debugger for this purpose unnecessary. And before that, when I was mostly programming in C, I implemented a simple stack trace facility, a bit like the glibc libSegFault or the Linux kernel Oops reports, and used it in a few projects (multithreaded programs, since at that time on Linux, threads and gdb did not mix well, and multithreaded core files did not work at all).

So I was referring to debuggers in a particular sense, but in my defence, I think the same sense was used in the articles which I linked to.

Debuggers considered harmful

In a recent article, Bob Martin talks about the interaction of Test Driven Development and debuggers.

In an old post to the linux-kernel mailing list, Linus Torvalds opines on the use of debuggers in kernel development. But I believe that his arguments, if they are valid at all, may apply more generally. Kernel development is not qualitatively different to the development of any large, complicated piece of software, with strong stability, performance, and compatibility concerns and real-time constraints, though it is special in the way that it brings all of these issues together.

The arguments presented by the two essays have a lot in common: Essentially, debuggers encourage the bad habits that lead to bugs (which reminds me of the argument that more roads mean more traffic).

I've very rarely used any debugger. I find that most bugs are either very easy to resolve, or very hard to resolve. For the easy cases, there is no need for the help that a debugger provides. For the hard cases, the debugger doesn't actually help: bugs are usually difficult to resolve because they are timing dependent, involve race conditions, or relate to subtle conditions in large amounts of data, where the cause is very distant from the immediate problem ("how did it ever get into that state?"). Maybe in the future, more sophisticated debugging tools will help with these difficult bugs, but I'm not optimistic: It seems more profitable to pursue design and programming techniques to make these bugs less likely to occur in the first place.

I think that Test Driven Development probably serves to separate these classes of bugs even further. Once the implementation passes a reasonably comprehensive set of tests, the remaining bugs are either really really easy, or really really difficult.

I would expect that the situation where debuggers are most legitimate is the writing of exploratory and "throw away" programs, where it is assumed that the programmer will always be there to sort things out if the program fails, and a debugger is the ideal tool for sorting things out. It's interesting that in the world of scripting languages, where such programs are most common, debuggers do not seem to be widespread. For instance, while there are debuggers for Perl (ActiveState has one), they do not seem to be a part of the standard Perl development toolkit, in the same way that they are part of almost any C/C++ development suite.

On the other hand, the dynamic languages crowd (especially Lispers) advertise their ability to examine programs and their data at run-time, and modify both of them without restarting. In this case, the development environment does not merely include a debugger, but subsumes it into the language.
