Older blog entries for jeffalo (starting at number 6)

Castor article published on the ONJava website.

I forgot to ask for money. :-(
However, it felt good just to have something published after my book on XML Schema was cancelled at virtually the last possible moment. grrr...

But I'm over that now. Somewhat.

Other things:

Still(!) working on a pseudo-universal namespace transform (PUNT) for XML namespace prefixes. I've got the encoding bit working (except for prefixes in content, but that's easy). Found a serious bug in Xerces 2.2.0, where it was dropping the URI values on the endElement() callback. Version 2.0.2 doesn't have this problem.

Decoding a PUNTed prefix is just a question of pattern matching through regex and replacing the binary-encoded non-Name characters with the original URI characters. And, of course, adding any missing namespace prefix declarations (which is the whole point of PUNT).
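The encode/decode pair might look roughly like this — a minimal Java sketch, assuming a made-up `_xHH_` hex-escape convention for the non-Name characters and a `.` separator (the real PUNT encoding may well differ):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Punt {
    // Encode a namespace URI into a Name-legal form by escaping every
    // character outside [A-Za-z0-9] as an _xHH_ hex run, then appending
    // the local name after a '.' separator (safe: URI dots get escaped).
    public static String encode(String uri, String localName) {
        StringBuilder sb = new StringBuilder();
        for (char c : uri.toCharArray()) {
            if (Character.isLetterOrDigit(c)) sb.append(c);
            else sb.append(String.format("_x%02X_", (int) c));
        }
        return sb.append('.').append(localName).toString();
    }

    private static final Pattern ESCAPE = Pattern.compile("_x([0-9A-F]{2})_");

    // Decode: regex-match the hex escapes and restore the original
    // URI characters; returns { namespaceURI, localName }.
    public static String[] decode(String encoded) {
        int dot = encoded.lastIndexOf('.');
        Matcher m = ESCAPE.matcher(encoded.substring(0, dot));
        StringBuffer uri = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(uri, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(uri);
        return new String[] { uri.toString(), encoded.substring(dot + 1) };
    }

    public static void main(String[] args) {
        String enc = encode("http://example.com/ns", "title");
        String[] dec = decode(enc);
        System.out.println(enc);
        System.out.println(dec[0] + " " + dec[1]);
    }
}
```

The round trip is lossless by construction, which is the "hard to screw up" property: decode(encode(uri, local)) always gives back the original pair.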

No, a PUNTed document is really not XML, however it is XML conformant. There's a difference. XML documents should above all be readable, although we're well past the point of that actually being the case in practice, even within the W3C techs. Just look at XSLT :-)

But there are a number of problems with Namespace prefixes, and PUNT is a brute-force approach to tightly binding URIs with their local element and attribute names. Really ugly, but hard to screw up. Which is the point.


Also thinking more on a typed metalanguage. Jeni Tennison and Uche Ogbuji are exchanging ideas on a very interesting thread on xml-dev right now. Ideas worth stea^H^H^H^H borrowing.


PUNT has given me the opportunity to play around with XML filters in SAX. I haven't read a really good article on how SAX XML filters work and what they're good for, though.
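From what I've pieced together so far, the idea is simple: an XMLFilterImpl sits between the XMLReader and the final ContentHandler, intercepting SAX events and rewriting them as they stream past. A minimal sketch (the upper-casing filter is just an illustration I made up, not anything PUNT does):

```java
import java.io.StringReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import org.xml.sax.helpers.XMLReaderFactory;

public class FilterDemo {
    // A filter that upper-cases element names as the events go by.
    static class UpperCaseFilter extends XMLFilterImpl {
        @Override
        public void startElement(String uri, String local, String qName,
                                 Attributes atts) throws SAXException {
            super.startElement(uri, local.toUpperCase(), qName.toUpperCase(), atts);
        }
        @Override
        public void endElement(String uri, String local, String qName)
                throws SAXException {
            super.endElement(uri, local.toUpperCase(), qName.toUpperCase());
        }
    }

    // Parse a tiny document through the filter and collect what the
    // downstream handler sees.
    public static String demo() throws Exception {
        XMLReader parser = XMLReaderFactory.createXMLReader();
        UpperCaseFilter filter = new UpperCaseFilter();
        filter.setParent(parser); // filter wraps the real parser
        final StringBuilder seen = new StringBuilder();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                seen.append(qName).append(' ');
            }
        });
        filter.parse(new InputSource(new StringReader("<doc><item/></doc>")));
        return seen.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Since XMLFilterImpl itself implements XMLReader, filters chain: one filter can be the parent of another, which is exactly the shape a multi-stage transform like PUNT wants.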


8 Aug 2002 (updated 8 Aug 2002 at 19:35 UTC) »

Just a few quick updates:

1) Looks like the Castor article is going to a webzine rather than the website. I'll let you know...

2) Made some progress on PUNT, which is a set of filters for converting Namespace prefixes in XML to encoded URIs. Results in real ugly documents, but it may be a useful tool for those still operating in namespace-unaware environs (or where namespaces are just handled badly by some third-party joker who won't listen to you).

3) Thinking more about a typed metalanguage (TML) that is backwards compatible with XML. I've had a couple of interesting debates with people on xml-dev of late.

For instance, Simon St. Laurent's a big fan of loose constraints, lexically specified, whereas I'm more the semantically-understood datatypes advocate. The twain have to meet somewhere, though. Further exploration of the TML concept should help delineate those boundaries. The XML Schema datatypes guys have pointed the way, although they got lost somewhere on the trail, probably attributable to politics and compromise (and deadline pressure).

There's something to be said for being a cowboy.


Okay, creative differences are getting sorted out. More fun on the way.

Well, Castor and I are having creative differences about what kind of documentation they need. Still, they're adding new docs, which is good. Any new information on their site regarding SQL binding is better than what's there now.

In the meantime, I'll finish up what I started and get it out in view somehow. Hmmm, if I could just get enough people to cert me here...

Still working on the SQL Binding docs, although I haven't touched them in the last two weeks. Trying to get an article accepted on applying dependency analysis in XML and "normalizing" the document structure.

Of course, this will raise the hairs on the backs of the doc-heads, so I have to be careful to say that I'm talking about XML-formatted data records. I can't see why the rules for normalizing data records in XML would be any different from those for fixed-length data records or CSV records.

Maybe someone could clue me in to some fundamental difference, but it would be hard to convince me at this point. I've scrubbed too much bunged-up data in the past to believe that normalization isn't worthwhile.
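To make the claim concrete, here's a hand-rolled example (invented data) of the same kind of redundancy you'd flag in a flat file — a repeating value factored out into its own record, exactly as in relational normalization:

```xml
<!-- Denormalized: the customer name repeats in every order record -->
<orders>
  <order id="1"><customer>Acme Corp</customer><total>120.00</total></order>
  <order id="2"><customer>Acme Corp</customer><total>75.50</total></order>
</orders>

<!-- Normalized: the repeating value lives in one place, referenced by key -->
<customers>
  <customer id="c1"><name>Acme Corp</name></customer>
</customers>
<orders>
  <order id="1" customer="c1"><total>120.00</total></order>
  <order id="2" customer="c1"><total>75.50</total></order>
</orders>
```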

Dear me:

Got started on the Castor SQL binding docs. Pretty much followed the first two sections of the XML binding doc as a template. Now I'm stuck for a single good example, although I have a rather lame one that will work.

Spent considerable time on Saturday experimenting with hiding the Castor-genned classes (from an XML Schema source) behind a facade of objects inheriting from them. I'm not sure how viable the idea is, but two Castor features would be necessary to even try to make it work:

  1. null extension mappings (class extending another, but having no fields and using the same table as its base class mapping)
  2. the generation of interfaces from XML Schemas in addition to classes implementing them
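The facade idea, in plain Java — the "generated" class below is a stand-in I made up, not actual Castor output:

```java
// Stand-in for a class Castor's schema compiler might emit.
// Regenerating it from the schema would clobber any hand edits,
// so nothing should be hand-written in here.
class GeneratedPerson {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Hand-written facade: application code programs against this subclass,
// so the generated base can be thrown away and regenerated freely.
public class Person extends GeneratedPerson {
    public String displayName() { return "Person: " + getName(); }

    public static void main(String[] args) {
        Person p = new Person();
        p.setName("Ada");
        System.out.println(p.displayName());
    }
}
```

The catch is the two missing features above: without null extension mappings, the facade subclass needs its own JDO mapping entry, and without generated interfaces there's no clean way to keep callers off the base class.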

Oh well... now I'll have to generate classes when the schema changes and hand merge them with changes already made to existing generated classes. Pain in the butt, but long term that's probably how it would end up, anyway.

Maybe a good beans editor could replace the XML Schema code generator so that I don't have to write those annoying accessor/mutator methods.


Dear me:

My first post here. Let's see if the habit sticks...

Current projects:

Write Castor JDO for SQL document. Some of this, at least, will wind up on the Castor website as they have a sore lack of documentation in this area. It cost me a week of pain to learn how to make the JDO mapping work; I hope to reduce that for others to about a day.

Need to get started on an XML Namespace filter that replaces namespace prefixes in an XML document with a Name-character-encoded namespace identifier. The target document will still be a well-formed XML document, but the effect will be to replace QNames with encoded universal names (UNames). I have a feeling that some will see this as abuse of XML (encapsulating information w/o use of markup), but I'll leave it up to the practitioners to weigh the pros and cons for themselves.

Free advice: Listen to the experts, but don't let them decide for you.

Other immediate plans: Write XML Schema macros and plugins for Arachnophilia. The result will be a passable free XML Schema editor.

Try to find something useful to do with XSLT. So far my work just hasn't required it, which probably means I'm not looking hard enough.

Current ICC rating: 1808
