Older blog entries for dsnopek (starting at number 18)


Lately I have been doing a whole bunch of hacking on Pyml. I really want to get the low-level stuff handled so that it is vaguely real-world usable. After that, I can hope for someone to even care about the userland API. Mainly, this affects the parser and compiler.

I started by working on the parser, which had a few known problems with determining the various PIs properly. It was based on a single, fairly complicated regular expression. This was great and mostly worked -- tweaking the regex ad infinitum could have fixed it. Unfortunately, it made accurately determining and maintaining the line numbers impossible, and that is essential to ever having a Pyml line debugger. I tried a lex/yacc parser but settled on a quirky mode-based buffer thing. Not pretty, but it gives me lots of control.
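To make the mode-based idea concrete, here is a minimal sketch of a scanner that flips between a "text" mode and a "code" mode and records the line on which each chunk starts. It is not Pyml's actual parser: the `scan` function and the `<?py ... ?>` delimiters are assumptions made up for illustration.

```python
def scan(source, open_tag="<?py", close_tag="?>"):
    """Split source into (mode, start_line, text) chunks.

    The delimiters are hypothetical stand-ins; Pyml's real PI syntax
    may well differ.
    """
    chunks = []
    mode = "text"   # current mode: "text" or "code"
    line = 1        # line number on which the current chunk starts
    pos = 0
    while pos < len(source):
        # In text mode we look for the opening tag, in code mode the closing one.
        marker = open_tag if mode == "text" else close_tag
        end = source.find(marker, pos)
        if end == -1:
            if mode == "code":
                raise SyntaxError(
                    "unterminated %s starting on line %d" % (open_tag, line))
            end = len(source)
            marker = ""
        text = source[pos:end]
        if text:
            chunks.append((mode, line, text))
        # Count the newlines consumed so the next chunk knows its start line.
        line += text.count("\n")
        pos = end + len(marker)
        mode = "code" if mode == "text" else "text"
    return chunks


if __name__ == "__main__":
    for chunk in scan("a\nb <?py x = 1 ?>\nc <?py y ?>"):
        print(chunk)
```

Because every chunk carries its own starting line, a later compile step can map generated code back to the template, which is exactly what a single big regex made hard.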

The initial Pyml compiler simply generated Python source code and passed it to the CPython compiler. It put everything on its own line since Python is very sensitive to white space. This worked but, again, the line debugger! So I started attempting to join code that would be on the same line using the ';' symbol. I found some pretty esoteric Python syntax errors this way! They could be avoided by putting everything in a line continuation inside an exec() statement. Hey, this worked, but it created really ugly bytecode when disassembled using the "dis" module.

About here I started thinking, "Wow, this would be so easy if I could just generate my own bytecode." Embarking on a wild goose chase, I explored compile.c, the "compiler" module and the ByteCodeHacks project. To summarize many hours of grimacing at the computer and writing on little pieces of paper, this is a no go -- especially if we want to support many versions of Python portably.

So here is my solution: continue to generate Python source and use ';' to connect lines. But I am going to use the "compiler" module to generate an AST to determine if the code fits the strange rules Python applies in this case.
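A rough sketch of that check follows. It is not the code Pyml uses: the entry names the old "compiler" module, while this sketch swaps in the standard `ast` module of current Python, and the `can_join` and `join_line` helpers are hypothetical names. The rule it encodes is that only simple statements may share a line through ';', so any fragment that parses to a compound statement (if, for, def and so on) is rejected.

```python
import ast

# Node types for Python's "simple statements"; only these may be chained
# with ';'. Anything else (if, for, while, def, class, try, with, ...)
# is a compound statement and needs a line of its own.
SIMPLE = (
    ast.Expr, ast.Assign, ast.AugAssign, ast.AnnAssign, ast.Delete,
    ast.Pass, ast.Import, ast.ImportFrom, ast.Global, ast.Nonlocal,
    ast.Assert, ast.Raise, ast.Return, ast.Break, ast.Continue,
)


def can_join(fragments):
    """Return True if every fragment consists only of simple statements."""
    for fragment in fragments:
        tree = ast.parse(fragment)
        if not all(isinstance(node, SIMPLE) for node in tree.body):
            return False
    return True


def join_line(fragments):
    """Join fragments onto one source line, or raise if the rules forbid it."""
    if not can_join(fragments):
        raise ValueError("compound statement cannot be joined with ';'")
    return "; ".join(fragment.strip() for fragment in fragments)


if __name__ == "__main__":
    print(join_line(["x = 1", "y = x + 1"]))
    print(can_join(["x = 1", "if x: pass"]))
```

Fragments that fail the check would have to fall back to separate lines, at the cost of the line mapping for that spot.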

One of these days, all this may result in some actual code...

First, I'd like to say that it's great to have Advogato back!

The House That Xmldoom Couldn't Build

My parents are selling their home without a realtor. So I made a web site for them to advertise the house. I wanted to build it as quickly as possible and then get it off my hands, so I decided that they needed to be able to build/edit it without me. Because of bugs I have been experiencing at work with Pyml and Xmldoom, I just threw together a Python CGI script with direct access to MySQL. Of course, I made these tools to make web development faster and easier -- mission not accomplished.

My parents' house

Assuming that Pyml was fixed up, I could easily have used it and maybe even saved some time. But the real problem is that the way I used SQL wouldn't have worked with Xmldoom, so I would have had to do all kinds of ugly "stuff" to accomplish the same thing.

The point is that the current Xmldoom design is terribly limiting. I think that soon I am going to have to make Xmldoom more like Propel (a similar PHP package), but for Python. Propel has a very flexible runtime SQL-generating engine. I can build the high-level XML definition stuff that Xmldoom has on top of that. Then I can work to expand the high-level stuff to cover all cases while still providing a low-level solution.


Before deciding that I would do the architectural rewrite described above, I had already fallen in love with Creole (Propel's database abstraction framework) and planned to make a Python version called Roma (for the language spoken by the Romani, known as Gypsies in the US. It's a hodgepodge combination of many European languages, which makes me think of how speaking SQL is!). I never really liked ADO but Creole is badass. Apparently, the author based it on JDBC. Which is great because then Xmldoom can use Creole for PHP, Roma for Python and JDBC for Java, this way continuing its language-agnostic nature.

Anyway, while Xmldoom is reasonably useful in many cases, I think it has so far served as a toy to help me learn Python and DB programming better. Now I think I understand the problem and have done the research enough to really move Xmldoom into the realm of usefulness.


But before I gut Xmldoom, I want to really spruce up Pyml and finally make a release. Since Savannah still doesn't have CVS commits back up, I am going to move it to Gna. I am working on fixing the parser, adding configuration files, and making a Python-based web server that will allow you to enter the pdb (Python debugger) and interactively step through your scripts. As far as I know, no web scripting language provides this feature but it seems really useful.

That's enough for now...


I am making a strong attempt at quitting smoking. Cold turkey like always. Today is not a good day to die ...

Also, I am starting work on the security camera software which I guess is going to be Nicholas and my current money making scheme. Mix that with starting on a new wine recipe and I should have more than enough to distract my nicotine addicted brain.


I still have a ton of receiving work to do, but since inventory is tomorrow, I can't touch any of it. I think I'll leave soon to get to the really important stuff.

Ah, back to the land of the living or maybe just a little closer..

I just got back from a month-long vacation in Ireland. Call me an escapist -- it's probably true. And now I hope to get back to some real work. My immediate plans are to "product-ize" Xmldoom and attempt to make some money off of it. Very unlikely, considering that it is a totally unknown project with absolutely no community around it. Which is why my next steps include writing user documentation. If I can attract users, I can get developers and then maybe some popularity.

Last year I e-mailed with Richard Stallman about the moral and philosophical implications of dual-licensing a product as GPL for free and proprietary for a fee. I figured that if anyone would find fault it would be Richard. Surprisingly, he thought that this was not contrary to the ethos of Free Software. I like to think of it as taxing proprietary software -- if only to help me sleep at night.

So the plan is to allow anyone to build applications with Xmldoom under the GPL, but if someone wants to make a non-GPL program out of it I gouge them with license fees. Unfortunately, there are a number of problems with this:

  • Code Generation: Although the support for generating client code in Xmldoom is broken (in favor of using the runtime engine), it could be used to circumvent the GPL. Since the output code will carry the same license as the input code, Xmldoom becomes a pass-through entity and can't restrict its uses to Free Software.
  • Web Applications: My current primary use of Xmldoom is for web applications. Having the code used by the WWW doesn't constitute "distribution" in the GPL, thus allowing the user to license their app under any license they desire.
  • Non-GPL Runtime Engines: This is a problem similar to the GCC vs GCC-XML issue. The GCC guys don't like GCC-XML because it could allow a proprietary application to use the C++ parsing abilities of GCC to compile code without abiding by the GPL. Xmldoom allows you (via the command line tool) to compile the input XML definition into a 'compiled' format that the runtime engine can use. This is useful for me because only one compiler need exist, yet I could easily write a runtime engine in any programming language I choose. This is bad because someone could write a proprietary runtime engine for their application, again avoiding the GPL. Of course, currently only one runtime engine exists so we're safe for now.

These are all roughly symptoms of the same disease but they affect the sustainable Free Software status of this project as well as my ability to profit from it. Maybe it's time I write Richard another letter ...

Hey guys. I don't know if anyone is truly interested in the drivel of some Free Software hacker and his fucked up life but I am running out of outlets for my frustration.

I am having woman troubles. She left me. But I think there is hope yet to salvage our relationship. And on top of that, I may lose my job soon. Not to mention my home just getting robbed on Saturday.

Maybe this isn't my lifetime ...

Wow. What a wonderful bunch of days. This past Friday we threw a party and my old friend Andrew came to visit from Florida. He is going to Full Sail to learn how to do 3D computer graphics and animation. We used to work on games together and now that he is in town, we're going to see if we can put together a Bomberman clone in that short period of time.

The day after the party (here comes the bit I called wonderful), my home got broken into and robbed. At first I thought that we were robbed by morons, until I finally realized all that they took:

  • Super Nintendo
  • VCR
  • VHS: Spinal Tap
  • My Laptop

I hadn't realized that the laptop was gone until a couple of hours ago. I am the moron. This is why I am growing really tired of this neighborhood. In the time we have lived here, my roommate has gotten randomly attacked twice and my upstairs neighbor once. I have yet to be subject to random violence but it's wearing me down still. If only I had enough money to live anywhere else! I am a smart guy! Someone, pay me to write Free Software! Any software! I may as well sell out if it'll save my life ...


I came up with a really cool string expansion framework and XML syntax. This will allow me to finally implement optional and list arguments. It integrated into the runtime engine nicely and I won't have to change it much. The compiled format will have to change a bit for named arguments and I am really getting tempted to sit down and perfect it. Unfortunately, that will eat valuable time.

Fortunately, I can easily make the code generation backend require all arguments instead of having to really implement it. The code generation code is becoming a chore but it is still really useful for PHP development. The multi-key object won't be as easy to ignore. I will simply have to make the code generator crash when asked to build one. Doing an actual implementation would require modifying the generated code's API drastically.

Okay folks, this is the way not to do things. Yesterday at work, I managed to erase all of the status information in our database. The status information is how we do all of our accounting and the only way we know where any given item is. Fortunately, we have been able to restore about 98% of the information from the last 6 months. All information before that is lost forever.

Here is what happened: I just added the complex where statements to Xmldoom. I tested them, but not very well. There was a bug that essentially stopped complex statements from being passed to the above statement. A simple fix. But while I was looking at the code I made another change that did the opposite of the first bug. Complex wheres worked, but simple wheres weren't passed up properly. When looking at the code it seemed like an obvious optimization (one less loop). But anyway, it caused the property Set* methods to omit the key! So what should have been:

UPDATE table SET data = 'yup' WHERE id = 0

became:


UPDATE table SET data = 'yup'

Thus setting the value of the data column to 'yup' for EVERY SINGLE ROW. The really sad part is that if I had simply run the unit tests before running my script, I would have noticed them fail. I have included as many checks in my code as possible to avoid this EXACT mistake. But alas, my ability to screw up vastly exceeds my ability to double check my work.
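One cheap check against this exact mistake is to refuse to build an UPDATE at all when the key is missing. The sketch below is only an illustration, not Xmldoom's code: `build_update` is a hypothetical helper, and the '?' placeholders assume a DB-API driver with the qmark parameter style (MySQL drivers typically use '%s' instead).

```python
def build_update(table, values, where):
    """Build a parameterized UPDATE, refusing to run without a WHERE clause.

    values and where are dicts mapping column names to values.
    """
    if not where:
        # An empty key set would touch every row in the table, so fail loudly.
        raise ValueError(
            "refusing to build UPDATE on %r without a WHERE clause" % table)
    set_sql = ", ".join("%s = ?" % column for column in values)
    where_sql = " AND ".join("%s = ?" % column for column in where)
    sql = "UPDATE %s SET %s WHERE %s" % (table, set_sql, where_sql)
    params = list(values.values()) + list(where.values())
    return sql, params


if __name__ == "__main__":
    print(build_update("table", {"data": "yup"}, {"id": 0}))
```

With a guard like this, the buggy code path would have raised an exception on the first Set* call instead of quietly rewriting the whole table.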

I don't think they'll fire me today.


Just an update on the list I posted on Sep 9th. I completed complex where statements (CVS update coming this afternoon). I haven't had a chance to test in real operation but the SQL is generated properly. I got the parser working for optional, list, and named arguments. But the compiler currently ignores these. This is going to require massive changes to the compiled format. The XML format and the parser now allow multi-key objects, but the compiler will just crash when given them. Nothing else has been touched.

I am still trudging on!

10 Sep 2003 (updated 10 Sep 2003 at 22:52 UTC)

I started work on the Xmldoom argument stuff as laid out in my last diary entry.  I started working and things were going well but then I chased this red herring for much too long.  The libxml2 Relax-NG validator said:

Extra element string in interleave
RNG validity error: file temp line 7 element string

It led me to believe that I had set up the <interleave/> in my schema wrong.  After all, I hadn't quite figured out if I had it right.  I want the ability to mix-and-match any number of column types in any order, which is something I had lots of trouble with in W3C Schema.  And a quick look showed something like this:

<int name="id"/>
<string name="name"/>
<int name="quantity"/>

So obviously I thought the schema wanted <int/>, <int/>, <string/> and was wondering what the hell this <string/> smack in the middle of the <int/>'s was.  But alas, a quick run through Sun's MSV and I got a simple:

Element "string" is missing attribute "size".

After seeing this, it all flashed into my head: libxml2 is reporting layered errors in reverse order.  It has some sort of element order handler that does the <interleave/> and <choice/> tags.  This handler makes sure order is good and then passes control to an element handler.  This handler sees the missing attribute and says, "validity error: ... element string" and returns to the interleave which exclaims the "Extra element string in interleave."  Wow, do these error messages suck.  Sun's MSV is such a beautiful piece of code.  Unfortunately it is written in Java and licensed under the Apache license.  I could provide a hook to allow external validation but this is a secondary solution.  I know the last thing I need right now is to get side tracked but the best solution would be to fix libxml2.  It's LGPL and written in C with a powerful Python wrapper.  Everything a man could ever want!

10 Sep 2003 (updated 10 Sep 2003 at 03:37 UTC)

Alright, I am compiling a list of all the things I need to fix in Xmldoom before the next release.  Most of this is in previous diary entries but I need it all in one place:

  • Method interface stuff:
    • Argument lists. For use inside complex where clauses. Ex. A list of statuses to search for.
    • Optional arguments. When data isn't given it isn't added to the WHERE clause.
    • Named arguments. For keyword arguments in the languages that support them.
    • Complex where statements. Allow nested AND and OR, with various <constraint .../>, <argument .../>, and <argument-list .../> tags inside.
  • Connecting table work:
    • Multi-table SELECT in SelectQuery.
    • Rework the KeyTree, Join code.  Make table <join .../>'s more low level and separate the types into individual classes.
    • Multi-key objects.
    • Transaction objects.
  • Fix Autologon in the code generator.
  • Work on improving the compiled format (Optional).
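
To show how the method interface items could fit together, here is a small sketch of a where-clause builder in which optional arguments vanish when no data is given, list arguments turn into IN (...) tests, and AND/OR groups nest. None of this is Xmldoom's real API: `constraint`, `in_list` and `combine` are hypothetical names, and the '?' placeholders assume a qmark-style DB-API driver.

```python
def constraint(column, value):
    """A 'column = ?' test; dropped entirely when value is None (optional argument)."""
    if value is None:
        return None
    return ("%s = ?" % column, [value])


def in_list(column, values):
    """A 'column IN (?, ...)' test for list arguments; dropped when the list is empty."""
    if not values:
        return None
    marks = ", ".join("?" for _ in values)
    return ("%s IN (%s)" % (column, marks), list(values))


def combine(op, *parts):
    """Join the surviving parts with AND or OR, parenthesized so nesting is safe."""
    parts = [part for part in parts if part is not None]
    if not parts:
        return None
    if len(parts) == 1:
        return parts[0]
    sql = "(" + (" %s " % op).join(part[0] for part in parts) + ")"
    params = [value for part in parts for value in part[1]]
    return (sql, params)


if __name__ == "__main__":
    where = combine(
        "AND",
        constraint("owner", "bob"),
        constraint("location", None),   # optional argument that was not given
        combine("OR",
                in_list("status", ["new", "sold"]),
                constraint("flagged", 1)),
    )
    print(where)
```

Note that `combine` returns None when every part was dropped, so the caller still has to decide whether a query with no WHERE clause at all is acceptable.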

If I actually work, this is about a month or two of work.  If I work to the full of my ability, it's a solid month.  Unfortunately, this list is really daunting and all of my projects dependent on Xmldoom need this stuff.  So it wouldn't be unlike me to start working on a totally unrelated project to avoid it.  Let's hope for the best.
