
21 Feb 2009 cdfrey   » (Journeyer)

Dear apenwarr,

    I'm sure you know all these things, but I'm afraid I'm as compelled to respond to you as you were compelled to write your own incompatible XML parser.

    I do believe you are missing the point of XML. Yes, it is a horrendous pile of textual complexity, piled high to the sky with syntax and nested markers. Yes, the available APIs, which exist to deliver the promise of XML to the end user via the application programmer, are complex enough to fall out of your head as soon as you've finished implementing the feature.

    But if you're going to do XML, please do it right.

    Interoperability is hard. Anyone can write their own parsers. And everyone has. That's why the monstrosity called XML was invented in the first place.

    It all starts with someone writing a quick and dirty parser, thereby creating their own unique file format whether they realize it or not. And since they probably don't realize it, they don't document it. So the next person comes along, and either has to reverse engineer the parser code, or worse, guess at the format from existing examples. This is then documented, either in Engrish, or in yet more code, which is likely to have at least one bug in it that makes it incompatible with the original format. This bug means that the second programmer has, whether they realize it or not, created their own unique file format...

    Oh dear.
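
    To make that concrete, here is a minimal Python sketch of the problem. The `note` element, the sample documents, and the `quick_and_dirty` function are all hypothetical, invented for illustration; they are not taken from your parser or your customer's. The sketch compares a regex-based "parser" with the standard library's `xml.etree.ElementTree` on two inputs that are perfectly well-formed XML.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data: well-formed XML that uses an entity and a CDATA section.
DOC = "<note>Tom &amp; Jerry <![CDATA[<b>1 < 2</b>]]></note>"

# Equally well-formed, but with an attribute the hand-rolled parser never expected.
DOC_WITH_ATTR = "<note lang='en'>hi</note>"


def quick_and_dirty(text):
    """Hand-rolled 'parser': return the raw characters between the note tags."""
    match = re.search(r"<note>(.*?)</note>", text, re.S)
    return match.group(1) if match else None


def conforming(text):
    """Standard-library XML parser: return the element's decoded text."""
    return ET.fromstring(text).text


naive = quick_and_dirty(DOC)
proper = conforming(DOC)

# The two results differ: the regex leaves the entity and CDATA markup undecoded.
print(repr(naive))
print(repr(proper))

# The regex silently fails on the attribute; the real parser does not care.
print(repr(quick_and_dirty(DOC_WITH_ATTR)))
print(repr(conforming(DOC_WITH_ATTR)))
```

    Both functions "work" on the simplest input, which is exactly why the divergence goes unnoticed until someone else's data arrives.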

    The DTD is the documentation. XML kinda forces you to create it. The nice thing about a DTD is that the computer can check it, which means it is less ambiguous than the pages of documentation that came with the old format. And with a DTD plus the XML data, the computer can validate the data against the DTD, so the next programmer who gets the data can parse it the way it was meant to be parsed.

    If your customer had used this XML process, you could have used the Big Fancy Professional XML Library. But instead, your customer is using their own UnknownXML, and you've created ApenwarrXML. The next programmer to come along may be forced to create yet another ThirdXML version, and the pain keeps propagating.
