Python Database Access

Posted 27 Dec 2002 at 00:03 UTC by lkcl

After developing the fourth python / SQL / HTML based application in a row, I decided to abstract the job of accessing SQL databases by a few layers, using python classes.

the results can be seen in a package called Custom, which contains the libraries and also some example usage.

SQL consists of statements to select, insert, update and delete rows, and to create and alter tables. The syntax is close to ordinary english, and is consistent enough to merit auto-generation functions that read or create SQL. variants of SQL are similar enough that ODBC could be created, and similar enough to make database abstraction layers practical.

So, I set out, eighteen months ago, to create some python classes to manipulate SQL, supporting as many types of SQL databases as i needed to access (MySQL and MS-SQL 2000). starting with the statements that create WHERE, FROM and table rules, i ended up with functions on a per-SQL-table basis that can insert, update and delete from SQL database tables.

in the end, the top-level functions are so simple and structured that it became worthwhile to write auto-generating code. the auto-generating code, sqlgen.py, takes a SQL file with INSERT, CREATE and ALTER TABLE FOREIGN KEY statements and generates all the functions needed to manipulate the database.
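purely as an illustration (the column types here are guessed; only the table and field names come from the examples further down), the kind of input sqlgen.py works from looks like this:

    CREATE TABLE invoices (
        id INTEGER NOT NULL PRIMARY KEY,
        cust_id INTEGER NOT NULL,
        ref VARCHAR(64),
        status VARCHAR(16),
        FOREIGN KEY (cust_id) REFERENCES customers (id)
    );

from a definition like that, the generated module ends up with the per-table insert, update and delete functions, plus a query constructor along the lines of the invoices_start() shown below.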

JOINs were a little tricky. JOINs are what make good SQL databases useful and manageable. however, by doing JOINs, you end up having to write quite complex SQL queries. the techniques i developed help avoid all of the expected complexity, constructing the SQL queries on your behalf.

the db_whererule_convert function takes a list of WHERE rules that need to be ANDed together. take the list [('name', "%search%"), ("id", [1,2,3])]: because of the way that the db_in_or_equal function works, the result is a WHERE clause along the lines of name LIKE "%search%" AND id IN (1, 2, 3) AND ....

AND was chosen rather than OR because it is more common to make rules using AND. any other rules (such as NOT, or OR) can be handled by making one of the items in the list a plain string, in which case you write that part of the rule yourself. out of approximately 150 functions in the example program, custom, only five to ten required extensions to the WHERE clauses that could not be covered by simple usage of db_whererule_convert.

from string import join   # python2-era string join: join(list, sep)

def db_whererule_convert(whererule):
    """ turn whererule list into string.  ignore empty strings.
    """
    if type(whererule) != type([]):
        whererule = [whererule]

    l = []

    for i in whererule:
        if type(i) == type(''):
            if i is not None and len(i) > 0:
                l.append(i)
        elif type(i) == type(()):
            (key, val) = i
            l.append("(" + db_in_or_equal(key, val) + ")")

    return join(l, " AND\n\t\t")
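db_in_or_equal itself isn't shown above; a minimal sketch of the idea (not the exact Custom implementation, just the behaviour described earlier) is something like this:

def db_in_or_equal(key, val):
    """ sketch only.  turn a (key, val) pair into a single comparison:
        a list becomes IN (...), a string containing % becomes LIKE,
        anything else becomes a plain equality test.
    """
    if type(val) == type([]):
        # e.g. ("id", [1, 2, 3])  ->  id IN (1, 2, 3)
        return key + " IN (" + join(map(str, val), ", ") + ")"
    if type(val) == type('') and val.find('%') != -1:
        # e.g. ("name", "%search%")  ->  name LIKE "%search%"
        return key + ' LIKE "' + val + '"'
    return key + " = " + repr(val)

# usage: db_whererule_convert ANDs the pieces together, e.g.
#   db_whererule_convert([('name', "%search%"), ("id", [1, 2, 3])])
# gives, roughly:
#   (name LIKE "%search%") AND
#           (id IN (1, 2, 3))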

once we have the whererule conversion and also an AS convert function, it becomes possible to create a complete SELECT statement:

def table_open(self, table, selectdict={'*': None},
               whererule=None, sortrule=None):

    whererule = db_whererule_convert(whererule)
    selectrule = db_as_convert(selectdict)
    table = db_as_convert(table)

    # create sql select command
    command = "SELECT " + selectrule + "\nFROM " + table
    if whererule is not None and whererule != "":
        command += "\nWHERE (" + whererule + ")"
    if sortrule is not None and sortrule != "":
        command += "\nORDER BY " + sortrule

    return self.__db_execute(command)

then, we have a function that constructs the list of SELECT rules, a dictionary of fields to return, a list of tables and some SORT rules. the db_wheretype function takes a dictionary that contains types as keys and field names as values. in this way, if the id is an int, long, string or even a DateTime or a weird class, the db_wheretype function selects the correct field name.
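db_wheretype itself is not listed in this article; a rough sketch of the idea (the type-to-field mapping in the docstring is invented for illustration, and the real Custom version may differ in detail) would be:

def db_wheretype(fields, value):
    """ sketch only.  fields maps python types to sql field names,
        e.g. {type(0): 'invoices.id', type(''): 'invoices.ref'};
        pick the field whose type matches value and hand back a
        (fieldname, value) tuple ready for db_whererule_convert.
    """
    for (fieldtype, fieldname) in fields.items():
        if isinstance(value, fieldtype):
            return (fieldname, value)
    raise TypeError("no field defined for values of type %s" % type(value))

the per-table function that uses it then looks like this: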

def invoices_start(self, id=None, from_id=None, cust_id=None):
    # rules
    l = []

    if id is not None:
        l.append(db_wheretype(self.invoices_id_fields, id))
    if from_id is not None:
        l.append(db_wheretype(self.invoices_from_id_fields, from_id))
    if cust_id is not None:
        l.append(db_wheretype(self.invoices_cust_id_fields, cust_id))

    # fields
    fields = {'invoices.id': 'invoices_id',
              'invoices.log_id': 'invoices_log_id',
              'invoices.from_id': 'invoices_from_id',
              'invoices.cust_id': 'invoices_cust_id',
              'invoices.ref': 'invoices_ref',
              'invoices.status': 'invoices_status',
              'invoices.bank_ref': 'invoices_bank_ref',
              'invoices.createdate': 'invoices_createdate'}

    rules = DPyDBrules(['invoices'], fields, l,
                       "invoices_status, invoices.createdate")

    return self.__dpydb.db_open(rules)

finally, we come to example usage. Andy Dustman's MySQLdb code returns a cursor, which has a fetchone() method that the display_table() function uses to fetch... er... one row from the query, and... er... displays it.

def display_invoices_info(self):

    cur = self.dpydb.invoices_start(id=self.id, cust_id=self.cust_id)

    fields = [('invoices_id', "InvID", invoices_id_disp_fn),
              ('invoices_cust_id', "AcctID", customers_id_disp_fn),
              ('invoices_ref', "Cust Ref", None),
              ('invoices_id', "Cost", invoice_cost_disp_fn),
              ('invoices_status', "Status", self.invoice_status_fn),
              ]

    return self.iso.display_table(cur, html_action, "Invoices", fields)
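display_table itself isn't reproduced here, but the rough shape of the loop is this (a simplified sketch, not the real iso.display_table; html_action, which the real version uses for per-row links, is accepted but ignored):

def display_table(cur, html_action, title, fields):
    """ simplified sketch: walk the cursor with fetchone() and build
        an html table, applying each column's display function where
        one is given.  html_action is unused in this sketch.
    """
    html = "<h2>" + title + "</h2>\n<table>\n<tr>"
    for (key, heading, disp_fn) in fields:
        html = html + "<th>" + heading + "</th>"
    html = html + "</tr>\n"

    # MySQLdb cursors return plain tuples, so map the column aliases
    # (invoices_id, invoices_status, ...) to tuple positions
    colnames = [d[0] for d in cur.description]

    row = cur.fetchone()
    while row is not None:
        html = html + "<tr>"
        for (key, heading, disp_fn) in fields:
            value = row[colnames.index(key)]
            if disp_fn is not None:
                value = disp_fn(value)
            html = html + "<td>" + str(value) + "</td>"
        html = html + "</tr>\n"
        row = cur.fetchone()

    return html + "</table>\n"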

so, i realise that this looks like quite a lot of work. however, the construction of rules - especially where JOINs are involved - can be done by adding extra rules into an instance of the DPyDBrules class. LEFT joins are a small pain that require more thought on how to integrate properly.

in all, i guarantee you that the amount of SQL you have to write by hand is drastically reduced, leaving only a few hundred lines of boringly regular python code to write (or auto-generate).

in fact, the abstraction is almost at the point where an XML or LDAP back-end could be added rather than a SQL back-end, with the minimum of fuss.

last but not least i would like to thank my wife for her drunken assistance and harassment that helped me to write this article. teehee.


I have one of these too, posted 27 Dec 2002 at 18:37 UTC by sab39 » (Master)

I've been writing a tool like this for about 3 years now.

I'm trying to advocate within my company for it to be released as open source, which you'd think would be reasonable since I wrote it myself essentially in its entirety (albeit partially on company time, although probably 60-70% on my own time). It's output-language independent (C# and Java currently supported, but new languages are pretty easy to add) and has specific support for Oracle and MS SQL Server. Not only that, but as well as creating the tables themselves, it can also do "alter table add column" or "alter table drop column" (and add and delete primary keys, indexes, foreign keys, etc) if you change the table definition and re-run the tool. And it integrates with CVS and with Visual Studio .NET.

I'm fairly proud of it, in case you can't tell that already. Of course, it's useless to *you* unless I can persuade the company to release it... :(

Stuart.

integrates with cvs?, posted 27 Dec 2002 at 23:54 UTC by lkcl » (Master)

sounds pretty good - three years of "i need a tool to make my daily life easier" is a long, long time :)

what's it written in?

how does it integrate with cvs?

creating and altering tables wasn't intended to be part of the original spec for the sql management project, a la webmin with the mysql plugin.

definitely not.

what _was_ intended was that a sql database designer could turn the sql database into useable python code.

going a la webmin (effectively porting webmin from perl to python) would be... imo... too much for this project.

there are always tools _like_ webmin - which i believe your program is similar to - for doing off-line design and then run-time maintenance later on.

the focus of the pysqldb project is fast and flexible data manipulation at run-time, specifically in python.

more details, posted 30 Dec 2002 at 17:41 UTC by sab39 » (Master)

I guess I'll start from the ground up with some of the inspiration for the tool and go from there to how it works and why.

First of all (problem #1) I was faced with a similar situation to yours: fed up of writing the same code to perform gets, updates, inserts etc. Essentially for every table I wanted a class with static getter methods, fields corresponding to each database column, and update() and delete() methods (where update() performed an insert for items that weren't already in the DB). Thus I could write code like:

Item item = Item.getById(id);
if (item == null) item = new Item();
item.someField = someValue;
item.update();

(this code is a mishmash between Java and C# naming conventions because I work in both - the tool is sufficiently language-independent that python, perl etc would be easy to add).

Secondly (problem #2) I usually deal with systems that have a development environment separate from the live environment. The traditional way of moving changes from development to live has been to copy the code and then manually try to remember what changes were made to the database. As you can imagine, this is horribly error-prone and takes a certain risk with the production system.

Of course (problem #3) the database schema isn't stored in source control either, so it's impossible to roll back without going to backups, which may not be what you want.

In addition (problem #4) the development environment usually has multiple developers sharing a database, although (as is usual with CVS) they each have a working copy of the code. This isn't a problem as such, but the solution needs to deal with that reality.

Problem #1 could be solved pretty much as you did, reading the database schema one way or another and generating code (although you would still need to do some guessing or manual work to cover the fact that the database schema doesn't give enough information to identify all the Get methods you might want). However, it doesn't cover problems #2-4.

Basically, I figured that solving problem #3 would imply a solution to #2 as well. If the database schema could be somehow stored in source control with the capability of rolling back, it would also be possible to do a metaphorical "cvs update" on the live database to bring it to the latest code without any error-prone manual intervention.

Instead of trying to invent a source-control system for databases, I decided to go the other way: make the database definition actually *be* source code that can be stored in source control in the usual way. Thus I defined a format which evolved into something that looks like this:

table item {
  fields {
    int id integer notnull readonly [the ID field];
    string name varchar(50) nullable readwrite [the name];
    int other_table_id integer notnull readwrite [the id of another table];
  };
  pkey sequenced id;
  get multi {fields {name}};
  references single other_table {fkey; by {other_table_id id}};
};

Given this file format, I wrote a tool that performed both the tasks I was interested in: it talks to the database and performs the necessary CREATE TABLE commands to construct the table itself (and, on oracle, creates a sequence and trigger to implement the sequenced primary key - on SQL Server it just makes the field an IDENTITY). And then it also generates source code in the language of your choice defining the fields and methods necessary (a property with getter/setter methods for each DB column, a getById method from the primary key, a getByName method for the "get multi", and a getOtherTable() method from the "references"). The contents of the "get" can end up much more complicated than that, with facilities for joins, arbitrary parameters, orderby clauses, and arbitrary SQL in the WHERE clause - it covers almost anything you might want to do.

The integration with CVS is simply to solve problem #4: If person A adds a field FOO to table BAR and runs the tool, the column will be added to the shared database. If person B runs the tool without having A's modified definition for BAR, the tool would notice the existence of a field that shouldn't be there and drop the FOO column again. The CVS integration ensures this doesn't happen - a (horribly complicated :( ) system of locking and version-awareness allows the tool to be aware that B's copy of BAR is out of date, so it doesn't attempt to modify that table in the shared DB. (I should add that the horrible complexity is all under the hood - neither A nor B need to do anything special, and they'll be notified if they both try to modify the same table in divergent ways).

Oh, and the tool itself is written in Java (it should run just fine on Free VMs) and the output is (as I think I mentioned) language independent. I added C# recently and it took about a week of my spare time, much of which was figuring out how to achieve the results I wanted in C# code in the first place, and getting integration with Visual Studio to work (that's hell, btw).

oo!, posted 2 Jan 2003 at 14:24 UTC by lkcl » (Master)

tingly feeling of something really cool here - i think you should really, really see if you can get it open sourced.

i like the bit about the object access.

how would you deal with Item's getbyId(id) function returning more than one entry in the query?

or do you simply treat it as if it _is_ one entry?

also, a getbyID would have to record the conditions under which the SELECT succeeded, such that the SELECT criteria could then be used on the UPDATE.

... very very interesting. esp. the running commands on cvs commits.

i preferred to stick with a standard SQL file format on the grounds that not many people would want to learn a new tool file format.

i used the FOREIGN KEY statements, even though MySQL doesn't support them, to indicate JOINs between tables, and that seems to suit my needs.

i like the object idea (class Item) very much, i think i will have to see if i can make that work, if that's okay with you.

single vs multi gets, posted 2 Jan 2003 at 15:32 UTC by sab39 » (Master)

If you have a "get single" that includes only "fields", like this:

get single {fields {something_id; something_else_id}};

then the tool will create a unique index based on those fields, as well as creating the getByWhatever() function. If you have something more complex so that it's not possible to create a corresponding unique index, the code just assumes you'll only get a single record (I forget exactly what I did, but I think it will just return the first result and ignore whether there are any more - that saves one rs.next() instruction). If it returns zero rows, you get null.

If you declare a "get multi", you get back a standard list type in your chosen output language (List in java, ArrayList in C#) with all the returned items. Of course, in that case you can do a test of the number of results returned in the usual way for your language.

I'm going to talk to someone today to explore yet another option on how I can legitimately open source it. It seems like the attitude here is "we don't care whether you open source it, but you'd have to talk to the lawyers and we're not going to pay the money to do that". So the trick is to determine, one way or another, that it's "unquestionably" okay to do it in a way that the lawyers don't have to get involved...

Whoo, I got permission!, posted 2 Jan 2003 at 19:13 UTC by sab39 » (Master)

Turns out my boss had been reading Larry Lessig's latest book over the holidays and was willing to go on record as supporting the idea. I'm going to see if the Savannah folks will allow me to start up a project there (the current dependencies on non-free java and non-free databases make it iffy) and I don't have any kind of homepage or anything yet but you can get the tarball at http://rainbow.netreach.net/~sballard/nrdo.tar.gz.

There's documentation in the doc/ folder which might be slightly out of date but should be mostly accurate (there haven't been any significant file format changes since it was written, except for a few additional keywords in some files that you almost certainly won't actually need for anything).

I'm going to try to put together at least a rudimentary homepage shortly, and we'll see what the Savannah people say. I really want a public readonly CVS repository and public mailing lists, both of which would be so much easier to do if I let the Savannah people worry about maintaining them...

fantastic!, posted 4 Jan 2003 at 21:44 UTC by lkcl » (Master)

great!!!!

well you have a couple of other options - take a look at apache.org (ASF): you might want to consider assigning the code to them. but also take a look at sourceforge.net if you want to maintain it under a non-ASF license.

then there's always setting up your own server but that's only for people who have the knowledge and resources.

hey, i don't care, it's just great to see another project underway.

don't worry about the dependencies: if the code is interesting to enough people then i'm positive that people will run with it and see where they can go.

automatic detection of single / multi, posted 5 Jan 2003 at 13:45 UTC by lkcl » (Master)

If you have a "get single" that includes only "fields", like this:

get single {fields {something_id; something_else_id}};

then the tool will create a unique index based on those fields, as well as creating the getByWhatever() function. If you have something more complex so that it's not possible to create a corresponding unique index, the code just assumes you'll only get a single record (I forget exactly what I did, but I think it will just return the first result and ignore whether there are any more - that saves one rs.next() instruction). If it returns zero rows, you get null.

hey, hold on a mo.

it should be easy to detect whether a query will return a single or a multi result.

even on table JOINs.

if the selection criteria includes ANDs on each and every UNIQUE or PRIMARY key, then there's guaranteed to be only one result.

SELECT * from customers, attributes WHERE customers.id = 5 AND attributes.cust_id = customers.id AND attributes.id = 399

this is guaranteed to return only one result if customers.id and attributes.id are both PRIMARY keys.

i think... it is necessary to have a JOIN between one of the PRIMARY keys...
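to make that concrete, a quick sketch (not part of any existing package - names and shape are made up) of the check i have in mind:

def query_is_single(primary_keys, pinned_fields):
    """ primary_keys maps each table in the query to its primary key
        column; pinned_fields is the list of columns the WHERE clause
        fixes to a single value with '=' (directly, or via a join to
        a column that is already fixed).  if every table's key is
        pinned, at most one row can come back.
    """
    for (table, keycol) in primary_keys.items():
        if table + "." + keycol not in pinned_fields:
            return 0
    return 1

# the example above:
#   query_is_single({'customers': 'id', 'attributes': 'id'},
#                   ['customers.id', 'attributes.id'])
# returns true: both primary keys are pinned, so it is a single-row get.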

replies, posted 6 Jan 2003 at 18:30 UTC by sab39 » (Master)

The savannah people accepted the project so it's set up at http://savannah.nongnu.org/projects/nrdo

They wouldn't let me put my code in their CVS until the dependencies are fixed, but I can use their mailing lists and bug/feature tracking software. So I've done that and created a mailing list for it :)

As far as single vs multi, it's not possible to make a determination in the general case because of two things:

1) nrdo allows specifying arbitrary SQL to be included in the WHERE clause, which it doesn't attempt to interpret. That condition could be sufficient to make what would otherwise be a multi get into a single get.

2) The get might be a single get due to a business rule that isn't enforced by the database, but by other logic. An example would be a table that includes a start and end date and has a business rule that ensures they never overlap. Then a get like:

get single {params {Date today notnull []};
            where [:today between start_date and end_date]};

is guaranteed to be a single get, but there's no way nrdo could determine that without being told.

Btw, since you are apparently familiar with MySQL, would you be interested in helping add support for MySQL, so that I have support for a Free DB and can upload the code?

Stuart.

YAPDB class, posted 6 Jan 2003 at 20:49 UTC by jmg » (Master)

Hmmm, reminds me of my Python DB class (or set of classes) I wrote. I was tired of not being able to use Python's spiffy dict emulation to make database access really simple. So, you want the entry where customer_id = 5 from the table customers, db business:

db = dbwrap.base()[(pgdb, ":business")]
print db['customers'][('customer_id', 5)]

This will return a dict from the query with the keys being the column names. The code is very ugly since I'm neither a db person, nor have I spent the time to make it clean enough to release.

Why emulate SQL when Python already has a method for handling the hierarchical nature of SQL? Sure this probably wouldn't do much good for handling JOINs, etc, but for most small DB apps, it's more than enough.
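To give a feel for the approach, here's a minimal sketch of the idea - not the actual dbwrap code, and it assumes a DB-API module that accepts %s placeholders with a tuple (MySQLdb does; adjust for your module's paramstyle):

class TableWrapper:
    """ Sketch only: one table, indexed dict-style by (column, value). """
    def __init__(self, conn, tablename):
        self.conn = conn
        self.tablename = tablename

    def __getitem__(self, key):
        (column, value) = key          # e.g. ('customer_id', 5)
        cur = self.conn.cursor()
        cur.execute("SELECT * FROM " + self.tablename +
                    " WHERE " + column + " = %s", (value,))
        row = cur.fetchone()
        if row is None:
            raise KeyError(key)
        # return a dict keyed by column name, as described above
        names = [d[0] for d in cur.description]
        return dict(zip(names, row))

class DBWrapper:
    """ Sketch only: indexing by table name gives a TableWrapper. """
    def __init__(self, conn):
        self.conn = conn

    def __getitem__(self, tablename):
        return TableWrapper(self.conn, tablename)

# roughly equivalent usage to the example above:
#   db = DBWrapper(MySQLdb.connect(db="business"))
#   print db['customers'][('customer_id', 5)]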

YAPdb - where?, posted 6 Jan 2003 at 21:18 UTC by lkcl » (Master)

jmg, that sounds exactly like the sort of thing i want to do: have you the code for download somewhere?

stuart, my curiosity might be my downfall :)

constructing sql queries, posted 7 Jan 2003 at 16:25 UTC by lkcl » (Master)

hiya stuart, hiya jmg,

okay: little more thought on the issue of queries.

in some ways, it doesn't exactly matter about single or multi queries. it can always be made the responsibility of the programmer to use "getlistofitems()" or "getdictsingleitem()" or equivalent. [the python list and dict objects can be subclassed in python2.2 for exactly this purpose, to derive your own result classes].
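for example, a tiny sketch of the list-derived class i mean (the names are illustrative, not from any existing package):

class QueryResults(list):
    """ subclass the builtin list (possible since python2.2) so a
        result set behaves like an ordinary list of rows, but can
        also be asked for exactly one row.
    """
    def getsingleitem(self):
        if len(self) != 1:
            raise ValueError("expected exactly one row, got %d" % len(self))
        return self[0]

# rows = QueryResults(cur.fetchall())
# for row in rows: ...                # treat it as a multi get
# row = rows.getsingleitem()          # treat it as a single get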

second, if you split the SQL statement (parse it, or ask the programmer to do it) down into select, update, sort, group by etc. rules, it then becomes a simple enough matter to reconstruct the query.

third, it is possible to emulate views: jmg, as you have done in your code you can extend that system by creating "virtual" tables - db['virtual_table_name'][('customers',1)].

by keeping the rules "split" it is easy to add things - to the WHERE clause in the example above it is easy to add "AND customers.id = 1" onto the rules already stored in the db class under the name 'virtual_table_name'.

it would help if MySQL had views, or it would help to use a database system that had views. your technique, jmg, would then be all that was needed, because for any more complex queries a VIEW could be created, and the "reconstruction" above would be handled automatically.

there is another way to handle this, in MySQL: using temporary (memory) tables. it's possible to create a table from a query, where the created table is stored in memory. further manipulation using simpler rules could then be done on that table.
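a sketch of that temporary-table trick (the connection details and the "unpaid" status value are made up for illustration; TYPE=HEAP is the older MySQL spelling, newer versions use ENGINE=MEMORY):

import MySQLdb

conn = MySQLdb.connect(db="custom")
cur = conn.cursor()

# build an in-memory temporary table from the complex query...
cur.execute("CREATE TEMPORARY TABLE tmp_inv TYPE=HEAP "
            "SELECT invoices.id, invoices.cust_id, invoices.status "
            "FROM invoices, customers "
            "WHERE invoices.cust_id = customers.id")

# ...then run simpler follow-up queries against it
cur.execute("SELECT * FROM tmp_inv WHERE status = %s", ("unpaid",))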

so there are many options / ways to skin the cat.

stuart, i am a little confused by savannah's decision. in order for people to help you convert the project to open source, it must be made available. so it uses proprietary libraries: so what?? that's fantastic: it will encourage open source developers to call the proprietary vendors and ask for free versions of their code!!!

btw i'm not at all confused, i was being diplomatic. there's got to be a better reason than that.

savannah, posted 7 Jan 2003 at 20:09 UTC by sab39 » (Master)

Savannah's views are very transparent: this is, after all, a branch of the FSF. You can't expect them to be anything other than extremist with regard to free software: it's their whole raison d'etre. They provide an extremely valuable service to the community and I have no problem accepting it on the terms that it's offered - I have no intention of complaining about the (very few) strings attached. It's not like I don't already have savannah pointing to an external page where you can download the existing tarball, proprietary dependencies be damned.

Besides, I don't think the reason that Sun, Microsoft and Oracle haven't open sourced their respective products is due to lack of users asking them to...

As far as single vs multi queries, in nrdo it is the responsibility of the programmer, because it's expected to be the programmer creating the dfn file. The programmer, thus, defines that the "get by id" method is a single get, but the "get by parent id" is a multi get. nrdo uses that information to create unique indexes where appropriate, but even when it's not appropriate to create a unique index, the programmer is still expected to specify the right thing. Among other things, it determines whether the return type of the generated function is a single instance of the class, or a list of instances.

YAPdb is actually dbwrap, posted 8 Jan 2003 at 06:15 UTC by jmg » (Master)

lkcl:
I've posted the code to dbwrap.py. A warning though: the code is very hackish. It's had some design changes mid way through, and has only been tested with PostgreSQL's pgdb module. It does nothing special, so any DB-API 2.0 module that has working cursors should work fine. It should even be somewhat thread safe due to the use of cursors, but I've never tested it; I only tried to write it with that in mind.

There is some documentation included, but it's not great. No warranties for the code, and of course, please email any changes/improvements back to me.

Thanks.

interesting somewhat related article, posted 8 Jan 2003 at 08:33 UTC by nixnut » (Journeyer)

evolutionary databases
anybody read this?

Personally I tend to stick to xml or just pickling my objects till I'm pretty much sure my classes won't change anymore and then make the move to persisting objects to a database.
A good framework for persisting objects to relational databases would be nice. Haven't taken a good look at that part of webware yet. Anybody got any experience with it?

Other options..., posted 3 Feb 2003 at 05:31 UTC by ianb » (Journeyer)

Well, this discussion is a little old, but I thought I'd point out the Python HigherLevelDatabaseProgramming Wiki page. There are a number of modules along these lines (one of which I wrote, SQLObject).

The name I've heard most often for these is object-relational mapper, which is to imply that the library creates language-native objects out of database rows. Some of the options are lighter than that, essentially just generating common SQL queries in a more convenient manner (e.g., SQLDict).

I think anyone who does database programming for long without using a library will end up creating an abstraction layer of their own. That's how SQLObject came into existence, and I think that's the case of most of the other (free software) options. You have to really like to watch yourself type to not create at least a little library.
