Older blog entries for dsnopek (starting at number 13)

Wow. What a wonderful bunch of days. This past Friday we threw a party and my old friend Andrew came to visit from Florida. He is going to Full Sail to learn how to do 3D computer graphics and animation. We used to work on games together and now that he is in town, we're going to see if we can put together a Bomberman clone in that short period of time.

The day after the party (here comes the bit I called wonderful), my home got broken into and robbed. At first I thought that we were robbed by morons, until I finally realized all that they took:

  • Super Nintendo
  • VCR
  • VHS: Spinal Tap
  • My Laptop

I hadn't realized that the laptop was gone until a couple of hours ago. I am the moron. This is why I am growing really tired of this neighborhood. In the time we have lived here, my roommate has gotten randomly attacked twice and my upstairs neighbor once. I have yet to be subject to random violence but it's wearing me down still. If only I had enough money to live anywhere else! I am a smart guy! Someone, pay me to write Free Software! Any software! I may as well sell out if it'll save my life ...

Xmldoom

I came up with a really cool string expansion framework and XML syntax. This will allow me to finally implement optional and list arguments. It integrated into the runtime engine nicely and I won't have to change it much. The compiled format will have to change a bit for named arguments and I am really getting tempted to sit down and perfect it. Unfortunately, that will eat valuable time.

Fortunately, I can easily make the code generation backend require all arguments instead of having to really implement it. The code generation code is becoming a chore but it is still really useful for PHP development. The multi-key object won't be as easy to ignore. I will simply have to make the code generator crash when asked to build one. Doing an actual implementation would require modifying the generated code's API drastically.

Okay folks, this is the way not to do things. Yesterday at work, I managed to erase all of the status information in our database. The status information is how we do all of our accounting and the only way we know where any given item is. Fortunately, we have been able to restore about 98% of the information from the last 6 months. All information before that is lost forever.

Here is what happened: I just added the complex where statements to Xmldoom. I tested them, but not very well. There was a bug that essentially stopped complex statements from being passed to the above statement. A simple fix. But while I was looking at the code I made another change that did the opposite of the first bug. Complex wheres worked, but simple wheres weren't passed up properly. When looking at the code it seemed like an obvious optimization (one less loop). But anyway, it caused the property Set* methods to omit the key! So what should have been:

UPDATE table SET data = 'yup' WHERE id = 0

Became:

UPDATE table SET data = 'yup'

Thus setting the value of the data column to 'yup' for EVERY SINGLE ROW. The really sad part is that if I had simply run the unit tests before running my script, I would have noticed them fail. I have included as many checks in my code as possible to avoid this EXACT mistake. But alas, my ability to screw up vastly exceeds my ability to double check my work.
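One more check of that kind could live in the statement builder itself. The following is only a minimal sketch: `build_update` and its parameters are hypothetical names invented for illustration, not Xmldoom's actual API. It refuses to produce an UPDATE when no key constraint was passed up.

```python
def build_update(table, values, where):
    """Build a parameterized UPDATE; refuse to build one without a WHERE.

    Hypothetical helper, not part of Xmldoom. `values` and `where` are
    dicts mapping column names to values.
    """
    if not values:
        raise ValueError("nothing to update")
    if not where:
        # The failure mode from this entry: a missing key touches every row.
        raise ValueError("refusing to build UPDATE without a WHERE clause")

    set_sql = ", ".join(f"{col} = ?" for col in values)
    where_sql = " AND ".join(f"{col} = ?" for col in where)
    sql = f"UPDATE {table} SET {set_sql} WHERE {where_sql}"
    params = list(values.values()) + list(where.values())
    return sql, params
```

With this in place, a Set* method that lost its key would raise instead of quietly rewriting the whole table. A deliberate whole-table update would need its own, explicitly named code path.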

I don't think they'll fire me today.

Xmldoom

Just an update on the list I posted on Sep 9th. I completed complex where statements (CVS update coming this afternoon). I haven't had a chance to test in real operation but the SQL is generated properly. I got the parser working for optional, list, and named arguments. But the compiler currently ignores these. This is going to require massive changes to the compiled format. The XML format and the parser now allow multi-key objects, but the compiler will just crash when given them. Nothing else has been touched.

I am still trudging on!

10 Sep 2003 (updated 10 Sep 2003 at 22:52 UTC) »

I started work on the Xmldoom argument stuff as laid out in my last diary entry.  I started working and things were going well but then I chased this red herring for much too long.  The libxml2 Relax-NG validator said:

Extra element string in interleave
RNG validity error: file temp line 7 element string

It led me to believe that I had set up the <interleave/> in my schema wrong.  After all, I hadn't quite figured out if I had it right.  I want the ability to mix-and-match any number of column types in any order, which is something I had lots of trouble with in W3C Schema.  And a quick look showed something like this:

<int name="id"/>
<string name="name"/>
<int name="quantity"/>

So obviously I thought the schema wanted <int/>, <int/>, <string/> and was wondering what the hell this <string/> smack in the middle of the <int/>'s was.  But alas, a quick run through Sun's MSV and I got a simple:

Element "string" is missing attribute "size".

After seeing this, it all flashed into my head: libxml2 is reporting layered errors in reverse order.  It has some sort of element order handler that does the <interleave/> and <choice/> tags.  This handler makes sure order is good and then passes control to an element handler.  This handler sees the missing attribute and says, "validity error: ... element string" and returns to the interleave which exclaims the "Extra element string in interleave."  Wow, do these error messages suck.  Sun's MSV is such a beautiful piece of code.  Unfortunately it is written in Java and licensed under the Apache license.  I could provide a hook to allow external validation but this is a secondary solution.  I know the last thing I need right now is to get sidetracked but the best solution would be to fix libxml2.  It's LGPL and written in C with a powerful Python wrapper.  Everything a man could ever want!

10 Sep 2003 (updated 10 Sep 2003 at 03:37 UTC) »

Alright, I am compiling a list of all the things I need to fix in Xmldoom before the next release.  Most of this is in previous diary entries but I need it all in one place:

  • Method interface stuff:
    • Argument lists. For use inside complex where clauses. Ex. A list of statuses to search for.
    • Optional arguments. When data isn't given it isn't added to the WHERE clause.
    • Named arguments. For keyword arguments in the languages that support them.
    • Complex where statements. Allow nested AND and OR, with various <constraint .../>, <argument .../>, and <argument-list .../> tags inside.
  • Connecting table work:
    • Multi-table SELECT in SelectQuery.
    • Rework the KeyTree, Join code.  Make table <join .../>'s more low level and separate the types into individual classes.
    • Multi-key objects.
    • Transaction objects.
  • Fix Autologon in the code generator.
  • Work on improving the compiled format (Optional).

If I actually work, this is about a month or two of work.  If I work to the full of my ability, it's a solid month.  Unfortunately, this list is really daunting and all of my projects dependent on Xmldoom need this stuff.  So it wouldn't be unlike me to start working on a totally unrelated project to avoid it.  Let's hope for the best.

Okay, the Xmldoom definition format needs to change drastically.  There are a couple of new rules that need to go into effect regarding the object structure in order to allow for more complex operations.

  • No "master" objects. Previously, every database would have a "master" object that wasn't attached to an object table which would be able to add parentless objects. These will be replaced with the ability to make "object" objects without any table. You can have an arbitrary number of these. "Why would you ever need multiple "master" objects?", you ask. Well, eventually you'll be able to add and extend the "object types" in both the PyRE and the generated code. You can break up Xmldoom functions into logical objects and then add non-Xmldoom functionality to them that addresses the same logical separation.
  • An object key will refer to a whole table and all its primary keys. This means we will have objects with more than one key value.
  • Transaction objects. This is a totally new idea for Xmldoom. It's an object that inherits properties (plus gets and adds) from a table object and additional properties (not sure about gets and adds) defined on a connecting table. For example, consider the tables: items, orders and items_ordered. "items" and "orders" are obvious, but "items_ordered" is a connecting table that attaches items to a particular order along with a quantity and sale price. We want an Order object with an AddMethod like: "Order.AddItem(item_id, quantity, price)" and a GetMethod that returns an Item object augmented with the properties "Quantity" and "Price".
  • Table-less objects can't add any object that has a foreign key reference or parent.
  • Table objects can't add any object that doesn't have a foreign key reference to (or parent of) its primary key.
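The transaction object idea above can be tried out by hand before any generator exists. This is only a sketch under stated assumptions: the `Order` class, the method names `add_item` and `get_items` (Python-style stand-ins for AddItem and the GetMethod), the `make_demo_db` helper and the column names are all made up for illustration, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3


class Order:
    """Hand-written stand-in for a 'transaction object'; not Xmldoom output."""

    def __init__(self, conn, order_id):
        self.conn = conn
        self.order_id = order_id

    def add_item(self, item_id, quantity, price):
        # The extra properties live on the connecting table.
        self.conn.execute(
            "INSERT INTO items_ordered (order_id, item_id, quantity, price) "
            "VALUES (?, ?, ?, ?)",
            (self.order_id, item_id, quantity, price),
        )

    def get_items(self):
        # One query returns each item's own columns plus the connecting
        # table's quantity and price.
        rows = self.conn.execute(
            "SELECT i.item_id, i.description, io.quantity, io.price "
            "FROM items_ordered AS io "
            "JOIN items AS i ON i.item_id = io.item_id "
            "WHERE io.order_id = ? ORDER BY i.item_id",
            (self.order_id,),
        )
        keys = ("item_id", "description", "quantity", "price")
        return [dict(zip(keys, row)) for row in rows]


def make_demo_db():
    """Create an in-memory database with the three example tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
        CREATE TABLE items (item_id INTEGER PRIMARY KEY, description TEXT);
        CREATE TABLE items_ordered (
            order_id INTEGER REFERENCES orders(order_id),
            item_id INTEGER REFERENCES items(item_id),
            quantity INTEGER,
            price REAL
        );
        INSERT INTO orders (order_id) VALUES (1);
        INSERT INTO items (item_id, description) VALUES (10, 'widget');
        """
    )
    return conn
```

Calling `Order(make_demo_db(), 1).add_item(10, 3, 2.5)` and then `get_items()` on the same object gives back the item's description together with its quantity and price. Returning plain dicts sidesteps the question of whether the augmented Item should be a subclass or a wrapper.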


Overall the current xml format has outlived the original design goals.  It was never meant to do all the shit I am trying to make it do.  But I just don't have the energy or the insight to rewrite it now.  What I really want is an "object only" version.  It is getting really annoying dealing with table definitions.  I have gotten a lot of good xml design ideas from Relax-NG which absolutely soars in abstraction.  But a ground up change is something I would like to avoid right now.  Once I start working on the compiled xml format (which knows nothing about tables at all), I'll start porting the object specific features back into the definition.  Hopefully, that will give me a real understanding of the abstraction required for an "object only" format.

I give this a big "Argh!"

I decided it was finally time to add multi-layer joins to Xmldoom because I needed to create MANY-TO-MANY relationships.  For example:

<table name="orders">
    <columns>
        <int name="order_id" unsigned="true" auto="true" primary-key="true"/>
        <datetime current="true"/>
    </columns>
</table>
<table name="items_ordered">
    <columns>
        <int name="order_id" unsigned="true"/>
        <int name="item_id" unsigned="true"/>
    </columns>

    <!-- TODO: could use a more succinct syntax? -->
    <join column="order_id" link="orders.order_id" relationship="parent"/>
    <join column="item_id" link="items.item_id" relationship="parent"/>
</table>
<table name="items">
    <columns>
        <int name="item_id" unsigned="true"/>
        <string name="description" length="80"/>
    </columns>
</table>

Anyway, we would obviously want to be able to retrieve the items on the order.  This is where the multi-layer join comes in.  We ask the definition, "How do I get from orders to items?"  The answer is:  

orders.order_id -> items_ordered.order_id
items_ordered.item_id -> items.item_id
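Answering that question is a shortest-path search over the declared joins. The sketch below is one way it might be done and is not the KeyTree code: `JOINS`, `table_of` and `find_join_path` are hypothetical names, and the join list is typed in by hand to mirror the two <join .../> tags above.

```python
from collections import deque

# Each join is (child_table.column, parent_table.column). This list is a
# hand-written stand-in for what the real definition would provide.
JOINS = [
    ("items_ordered.order_id", "orders.order_id"),
    ("items_ordered.item_id", "items.item_id"),
]


def table_of(column):
    """Return the table part of a 'table.column' string."""
    return column.split(".", 1)[0]


def find_join_path(start, goal, joins=JOINS):
    """Breadth-first search over tables; returns a list of (from, to) hops."""
    # Joins can be walked in either direction.
    edges = {}
    for child, parent in joins:
        edges.setdefault(table_of(child), []).append((child, parent))
        edges.setdefault(table_of(parent), []).append((parent, child))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for src, dst in edges.get(table, []):
            nxt = table_of(dst)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(src, dst)]))
    return None  # the tables are not connected
```

`find_join_path("orders", "items")` yields the same two hops listed above. Breadth-first search returns a path with the fewest hops, which seems like a reasonable default when more than one route exists, though a real definition might need a way to pick a specific route.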

In the end, we'd have an Order.GetItems() method that would return the items on the order, right?  Well, not exactly.  What if items_ordered contained more data about each item ordered?  For example, the quantity or the sale price.  Not only common but almost essential.  So I could almost see the need for a new object: ItemOrdered with properties for Quantity and SalePrice.  But this isn't exactly what we want because an object must have a single object key.

Here are a number of options:

  • Allow an object to be identified by two keys. This also opens the possibility for simply declaring the keys in the table and <object-key ...> will only refer to a table. This would also remove the need for <unique/>.
  • Performing the two-layer join but also merging the Item object with the ItemOrdered object so that we can get all of its properties without another query. Or maybe return a dict or tuple containing the packed Item object and the extra ItemOrdered data? Maybe the ability to have ItemOrdered be a descendant of the Item object?


The answer will probably be a combination of the above ideas.  I am just upset that I never thought of this situation before now.

I recently decided it was time to start working on an XML Schema for Xmldoom. This led me down a path I never knew existed. First, I started working with the XML Schema format from W3C because I thought it was the "standard". The format was difficult to understand but I did manage to learn how to use it. Then I started looking for a validator so that I could test my schemas. I tried xmllint (from libxml2) because I was always very fond of libxml2 from my SAGElib days. Anyway, it choked on some things I thought should *definitely* have worked. So I searched their mailing list for info and found that W3C support was incomplete but Relax-NG support was complete. At first I thought, "Oh man, I have to find a new validator." Then I read another post on the libxml2 mailing list, bashing the W3C format altogether and supporting Relax-NG. After battling all day with the W3C format, the arguments rang clear.

And now, I am a Relax-NG man! Expect Xmldoom schemas very soon.

OK, this is just proof that I write far too many custom projects to accomplish something simple. In Pyml, we need a simple mechanism for adding entries to the sys.path. Currently, I use this snippet:

<?py import os, sys; sys.path.insert(0, os.path.join(PymlPath, "..")) ?>

While this works, it is clunky in size and is not totally correct. What we really need is something that easily adds the path to sys (possibly without import'ing os?) and then removes it after execution (can we have separate sys.path's for each script?). A possible syntax includes:

Does os.path.join(PymlPath, '..'):
<?path '..' ?>
Does os.path.join(PymlPath, '..', 'lib'):
<?path '..', 'lib' ?>

My temporary solution is to add another .htaccess to each "top-level" directory.
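On the Python side, the "add it, then remove it after execution" part could be handled by a small context manager that a `<?path ... ?>` instruction would wrap around the script. This is a sketch under assumptions: `extra_sys_path` is a hypothetical name, nothing like it exists in Pyml today, and unlike the snippet above it also normalizes the joined path.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def extra_sys_path(base, *parts):
    """Temporarily put os.path.join(base, *parts) at the front of sys.path."""
    path = os.path.normpath(os.path.join(base, *parts))
    sys.path.insert(0, path)
    try:
        yield path
    finally:
        # Remove one occurrence of the path, even if the script failed.
        try:
            sys.path.remove(path)
        except ValueError:
            pass  # the script already removed it itself
```

A script would then run inside `with extra_sys_path(PymlPath, '..', 'lib'):`. This does not give each script its own sys.path, and modules imported inside the block stay cached in sys.modules afterwards, so it only answers the cleanup half of the question.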

2 Sep 2003 (updated 2 Sep 2003 at 14:34 UTC) »

In working progressively on an Xmldoom database that cannot be destroyed in between extensions, there are a couple of features that would be useful for SQL script generation:

1. Add a --partial option that will only generate the definition for the table given on the command line. (This I need now!) That way you can add tables to an existing database.

2. Add an --alter option that would cause the SQL script to use ALTER syntax (instead of CREATE) for each table specified. (This is a "blue sky" feature, i.e. I don't need it immediately). This way we can specify all the new and changed tables with --alter and --partial. The only weirdness I see is if you want to "alter" only a single table: "xmldoom --alter table1 --partial table1".
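Here is a rough sketch of how the two options might be parsed. Neither flag exists yet, `parse_args` and `plan` are made-up names, and the rule that an altered table counts as selected is my own assumption about how to get rid of the "--alter table1 --partial table1" weirdness.

```python
import argparse


def parse_args(argv):
    """Parse the hypothetical --partial / --alter options."""
    parser = argparse.ArgumentParser(prog="xmldoom")
    parser.add_argument("--partial", nargs="+", default=[], metavar="TABLE",
                        help="only generate SQL for these tables")
    parser.add_argument("--alter", nargs="+", default=[], metavar="TABLE",
                        help="emit ALTER instead of CREATE for these tables")
    return parser.parse_args(argv)


def plan(all_tables, args):
    """Return (table, statement_kind) pairs for the SQL script."""
    # Assumption: a table named in --alter is implicitly selected, so it
    # does not need to be repeated under --partial.
    selected = set(args.partial) | set(args.alter)
    if not selected:
        selected = set(all_tables)  # no options: generate everything
    unknown = selected - set(all_tables)
    if unknown:
        raise SystemExit(f"unknown table(s): {', '.join(sorted(unknown))}")
    return [(t, "ALTER" if t in args.alter else "CREATE")
            for t in all_tables if t in selected]
```

With this rule, `xmldoom --alter table1` alone would touch only table1, and `--partial` is needed only for tables that are brand new.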

In the future, I could see setting up a build system where each table is held in a separate file (either: the main .xml file is generated at build time OR with a cool new tag like <include filename="..."/>). And when the table definition is changed, a SQL script is created to alter the database, which is then automatically updated. We could even do it with a single .xml file with SCons and a new node type!

Just singing about the sky ...
