Older blog entries for titus (starting at number 136)

Laying eggs

I'm now firmly committed to PJE's setuptools/easy_install. It's invaluable as a way to make precompiled Python distributions available.

As part of our agile development tutorial at PyCon, Grig and I are developing an application. (More on the app anon: our first release is due by the end of the month.) The app depends on Durus and CherryPy, neither of which "just work" with easy_install on Windows.

For Durus, the problem is that it has a binary extension; you need a C compiler to build it.

For CherryPy, there's some issue with SourceForge download page breakage, and maybe something problematic with paths on Windows XP. (I haven't isolated the problem.)

So, I built eggs. I found a colleague upstairs, Brandon King, who had just gotten Windows compilation working for Python packages. After patching

from setuptools import setup

into the top of setup.py for both packages, 'python setup.py bdist_egg' produced nice, functional eggs.
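
For reference, the change really is just that one-line swap at the top of setup.py; everything else stays the same. Here's a minimal sketch (the package name, extension, and files are invented for illustration, not Durus's or CherryPy's actual setup.py):

# setup.py -- minimal sketch; real setup.py files carry much more metadata
from setuptools import setup, Extension   # instead of: from distutils.core import setup

setup(
    name="SomePackage",                   # hypothetical package
    version="1.0",
    packages=["somepackage"],
    ext_modules=[Extension("somepackage._speed", ["somepackage/_speed.c"])],
)

With that in place, 'python setup.py bdist_egg' drops a platform-specific .egg into dist/.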

They are available at

http://issola.caltech.edu/~t/dist/

and can be grabbed with

easy_install -f http://issola.caltech.edu/~t/dist/ Durus

or

easy_install -f http://issola.caltech.edu/~t/dist/ CherryPy

I'm still having some issues installing things on Python-naive boxes (that is, Windows boxes with just a standard install of Python), but that will have to wait for the first release. (FWIW, the problem will probably be fixed by building eggs for the pywin32 code.)

PyPi

Now that I've been using PyPi and easy_install for a few weeks, I'd guess that about 80% of packages are directly and immediately installable via easy_install by typing 'easy_install package_name'. I've run across a bunch that aren't, though. Those include Zope3 and CherryPy; zope.testbrowser also had a problem, but I think that was an issue with the '.' in the middle of the name.

I would be very happy if it were possible to install every package on PyPi with easy_install, and it might be a worthwhile project to highlight those that can't, for whatever reason. Hmm, could become part of the Cheesecake project... a list of all the projects that don't work with easy_install, separated into lists of those that are easy_install's fault vs those that are the author's fault. (The latter would, of course, be in bright red with BLINK tags.) Perhaps we could even call the list "Sweaty Cheese" ;).
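
The core of such a checker could be embarrassingly small. A rough sketch -- the package list is hard-coded here, and a real version would pull names from PyPi and run inside a throwaway Python installation, since easy_install actually installs things:

# easy_install_checker.py -- rough sketch only
import subprocess

PACKAGES = ["Durus", "CherryPy", "zope.testbrowser"]   # would come from PyPi

sweaty_cheese = []
for name in PACKAGES:
    status = subprocess.call(["easy_install", name])
    if status != 0:
        sweaty_cheese.append(name)

print "Not easy_install-able:", sweaty_cheese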

Another fantastically useful project would be to automatically download and build Windows and Mac OS X eggs for all of the PyPi projects. Hmm, there'd be some security issues, but I bet you could work something out with public keys where only packages authorized by some key authority would be automatically downloaded. Humm.

--titus

7 Dec 2005 (updated 7 Dec 2005 at 19:15 UTC) »
Oblique Strategies

I'm a big fan of Oblique Strategies; so once I found Robin Parmer's Python implementation, I thought, why not write a quick Web site for it?

Clearly I need to spend more time on work. But it was so quick and easy... ;)

Speaking of which...

It's just too easy

The discussion on Aaron Swartz's blog about rewriting reddit & web.py illustrates a few amusing points about Python. Apart from the downright absurdity of some of the discussion so far -- general Lisp snarkiness, and Aaron's assertion that all bajillion Python Web frameworks suck (except for his, which isn't available yet...) -- I think a few truths emerge.

The main truth is that it's clearly too easy to write your own Web framework in Python. It's less work to code a few hundred lines of Python than it is to understand someone else's few hundred lines of Python; it's also easier to continue thinking like you already do than it is to adapt your thinking to someone else's API. And, most important of all, a few hundred lines of Python is really all you need for a fully functional Web app framework. Thus, our massive proliferation of Web frameworks. (As Harald Massa writes: Python: the only language with more Web frameworks than keywords.)
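
Just to show how low the bar is, here's a toy dispatcher in about thirty lines -- pure WSGI, nothing borrowed from any existing framework, and every name invented for the example:

# toyweb.py -- a toy, not a framework anyone should use
import re

class ToyApp(object):
    """Map URL regexps onto handler functions; callable as a WSGI app."""
    def __init__(self):
        self.routes = []                      # list of (compiled regexp, handler)

    def expose(self, pattern):
        def decorator(fn):
            self.routes.append((re.compile(pattern + '$'), fn))
            return fn
        return decorator

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '/')
        for regexp, handler in self.routes:
            match = regexp.match(path)
            if match:
                body = handler(**match.groupdict())
                start_response('200 OK', [('Content-Type', 'text/html')])
                return [body]
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return ['not found']

app = ToyApp()

@app.expose(r'/hello/(?P<name>\w+)')
def hello(name):
    return '<h1>hello, %s!</h1>' % name

# hand 'app' to any WSGI server and, presto, you have a "framework"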

Clearly, the only way to cut down on the number of Web frameworks is to make it much harder to write them. If Guido were really going to help resolve this Web frameworks mess, he'd make Python a much uglier and less powerful language. I can't help but feel that that's the wrong way to go ;).

Another truth that I stumbled over yesterday: it's much harder to write good, clean, maintainable, documented, tested code than it is to write a functional Web framework. Partly this is a matter of withstanding the test of time; partly it's a matter of development practices. If there's one thing I'd like to explore for my own projects, it's how to keep tests, documentation, and code all in sync. (Want to know how to do it? Come to our tutorial!)

I think the test for the future will be simple survival; this will be based on things like documentation more than on functionality. For example, Quixote, though powerful, suffers from poor documentation. CherryPy, which enforces a similar coding approach on apps, has an attractive, busy Web site. Getting started with CherryPy is simple; getting started with Quixote is not so simple. This really matters in the long run.

Picking a Web framework

The above thoughts occurred partly in the context of my own choice of frameworks for a new project.

I'm starting a new project for our Agile Development and Testing tutorial at PyCon, and I wanted to try something other than Quixote (just for novelty's sake, ya know?). My rough requirements were: must run on Windows; must not commit me to a database other than Durus, by e.g. requiring SQLObject; shouldn't be a huge framework (I'd like to distribute the entire thing in an egg); and needn't (shouldn't) be a full content-management system. So I took a look at QP, browsed around Django and TurboGears, visited web.py, and settled on CherryPy. (Actually, I started on CherryPy and then discovered that their introductory "hello, world" example didn't work with their latest stable release, 2.1.0, and gave up in frustration. Then I came back to it after striking out with the others.)
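
(For context, the 2.1-era "hello, world" is only a handful of lines -- this is my recollection of it, so treat it as approximate:)

import cherrypy

class HelloWorld:
    def index(self):
        return "Hello world!"
    index.exposed = True

cherrypy.root = HelloWorld()
cherrypy.server.start()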

So, why CherryPy?

Django and TurboGears are too much. Django is more of a CMS, and TurboGears commits me to too many packages.

QP is largely undocumented and doesn't run on Windows. (The former is a real problem: I spent 15 minutes trying to figure out how to run QP on something other than localhost, and couldn't manage.)

web.py? Not available yet. Maybe it will be, maybe it won't be... but it's awfully hard for me to evaluate a package based on 50 lines of documentation and no code, although I applaud the attitude.

CherryPy is just what I need: lightweight, fairly obvious code on my side, nothing complex required. We'll see how I feel in a week or two ;). I've already broken it once or twice, and I think the internal code is more complex than it needs to be (based solely on ~30 minutes of browsing around to fix the breakage), but I needed to commit to something before Grig keelhauls me... and CherryPy is clearly the best-suited of the things I looked at.

And it's got a really attractive Web site...

In other news...

Commentary is one of the coolest things to cross my radar screen in a while. (It's an AJAX-y way of adding comments to Web sites.)

I think there's a really good application somewhere in here; something combining Raph's trust metric stuff with Commentary to make a community-wide commenting/annotation system. (I put some work into such a thing earlier this year, but decided to focus on twill first.) I hope someone else writes it so I don't have to ;).

--titus

socal-piggies dinner meet

Anyone in the LA area who is interested in Malaysian food & Python conversation on Wednesday, contact me soon -- I'm confirming reservations for Kuala Lumpur in Pasadena, at 7:30pm. (So far we have 11 people!)

--titus

5 Dec 2005 (updated 5 Dec 2005 at 18:44 UTC) »
Two interesting interviews.

How BioWare Makes Game Communities Work

An Interview with Scott Bakker

The Lone Software Developer's Methodology

Entertaining and interesting rant (with explanation).

When unit testers fail...

At some point over the last week, my nose unit tests for twill started flailing. Yes, not just failing, but flailing: there was so much output that it was tough to figure out exactly what was going on. I spent a few minutes across a few days trying to figure out what had changed; I'm now pretty sure it was due to a nose version upgrade, but I could never actually figure out which version of nose recognized all my tests *and* still worked.

Whatever the proximal cause, it turns out I designed my unit tests incorrectly. The tests are in modules (separated into individual files), with 'setup', 'test', and 'teardown' functions in each module. The latest version(s) of nose were discovering the 'setup' and 'test' functions and running (as near as I can tell) 'setup' twice, 'test' once, and 'teardown' zero times.

I'm still not sure if this is a bug; I couldn't figure out what the behavior of nose should be, because it's not explicitly documented on the nose page. (Maybe it's on the py.test page?)

Finally, I discovered 'setup_module' and 'teardown_module' and renamed setup and teardown in all my tests. That solved my problems.
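
For anyone hitting the same thing, the fix looks roughly like this (a sketch; the real twill test modules do rather more in their fixtures):

# tests/test_something.py -- sketch of module-level fixtures under nose
_state = {}

def setup_module():
    # run once, before any test function in this module
    _state['server'] = 'started'        # e.g. fire up a local test web server

def teardown_module():
    # run once, after the last test function in this module
    _state['server'] = None             # e.g. shut the server back down

def test_server_is_up():
    assert _state['server'] == 'started'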

I also learned (from the nose documentation) that you could specify a test collector in setup.py when using setuptools. So now 'python setup.py test' will run all the tests.
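
The setup.py side of that is a one-liner; the other arguments here are just placeholders:

from setuptools import setup

setup(
    name='twill',
    # ... the usual arguments ...
    test_suite='nose.collector',
)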

The shocker for me in all of this was how uncomfortable I felt even thinking about modifying twill without the unit test framework. Even though I could execute the tests individually, I knew that I wouldn't run them nearly as frequently as I do when 'nose' is working. Automated tests are a really important security blanket now...

What users want, and what they'll get.

So, it appears that an increasingly common use case for twill is Web browsing.

When I started developing twill, I intended to develop a replacement for PBP. PBP was aimed at Web testing, and that's what I needed myself. Roughly, this meant:

  • a simple scripting language, with an interactive shell;
  • functional cookies/formfilling/etc., so that interacting with a Web site actually worked;
  • a fair amount of assert stuff, like "code" (to make assertions about return codes) and "find" (to make assertions about presence of specific text/regexps).

Then I made a few design choices: to use Python and Python's Cmd module for the interactive shell, and to use a metaclass to automagically import all twill.commands functions into the interactive shell. This meant that all of the functions accessible from twill were also directly accessible via Python, and the twill language docs functioned equally well as Python language docs.
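
The pattern itself is simple. Here's a generic sketch of the idea -- not twill's actual code, and using plain setattr rather than a metaclass to keep it short; the command functions are made up:

# shell_sketch.py -- the general pattern, not twill's implementation
import cmd

# stand-ins for a module full of plain functions (think twill.commands):
def echo(text):
    "echo <text> -- print the text back out"
    print text

def add(a, b):
    "add <a> <b> -- print the sum of two integers"
    print int(a) + int(b)

COMMANDS = [echo, add]

def _make_do(fn):
    def do_cmd(self, arg_string):
        fn(*arg_string.split())          # the shell command just calls the function
    do_cmd.__doc__ = fn.__doc__
    return do_cmd

class Shell(cmd.Cmd):
    prompt = '>> '

for fn in COMMANDS:
    setattr(Shell, 'do_' + fn.__name__, _make_do(fn))

if __name__ == '__main__':
    Shell().cmdloop()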

Thus, twill could be used from Python code to do most Web browsing tasks that didn't involve JavaScript.

I was happy with this result, but it was largely unintended. I mostly use twill via the scripting language.

However, the early press on twill was from people like Grig and Michele, who talked glowingly about the ability to use twill from Python. This has led to people wanting more functionality: especially, sensible return values from twill commands. This, in turn, has led to a bit of caviling on my part about this direction, because I haven't really thought it through.

Anyway, the upshot is that I have to rethink my priorities for twill a bit. I was going to focus on recording functionality and extension management for the next release, but it seems like I should also work on simplifying and documenting the browser interaction code. Given the one-to-one mapping between twill.commands and the scripting language, I don't want to add things like return values and Python-specific documentation to the commands; perhaps I can satisfy people with a browser-object interface...

The big surprise for me -- and it really shouldn't have been a surprise -- is that people really seem to want to do Web browsing via a trivially simple interface: go, follow, submit, and get_page contain 90% of the desired functionality. mechanize and mechanoid are serious overkill for this level of Web browsing, and the fact that twill puts a high priority on "just working" (albeit at the expense of customizability) probably helps contribute to users' interest.
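
In practice, the Python-side usage people seem to want looks about like this (a sketch; the URL and link name are placeholders, and the commands mirror the scripting language one-for-one):

# driving twill from Python for simple browsing
from twill.commands import go, follow, code, show

go("http://www.example.com/")        # fetch a page
follow("downloads")                  # click a link by name or regexp
code(200)                            # assert on the HTTP response code
show()                               # dump the HTML of the current page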

The simplest route for me to go is probably to work on two applications I've been planning: twill-crawler, and mailman-api. (Presumably the names give away the function ;).) Then I'll have some idea of what people need for Web browsing; right now I'm feeling a bit underplanned, so to speak.

--titus

Review of "Endless Forms Most Beautiful"

A review of "Endless Forms Most Beautiful" by Sean Carroll. (This is a book on evolutionary developmental biology, which is one of the things my lab works on.)

In-process WSGI testing

Reached a stable point with a little side project: wsgi_intercept. This is a broken-out version of my in-process WSGI testing stuff that works for all of the Python Web testing tools I could find. Specifically, I monkey-patched webtest and webunit; provided a urllib2-compatible handler; and subclassed the mechanoid, mechanize, and zope.testbrowser Browser classes.
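
The kernel of the in-process idea is small; wsgi_intercept's real work is in fooling each HTTP client library into calling the app instead of a socket. Stripped of that, in-process WSGI testing is roughly this (a sketch only -- a real environ needs more keys to be strictly WSGI-compliant):

# minimal in-process call to a WSGI app, no network involved
import sys
from StringIO import StringIO

def call_wsgi_app(app, path='/'):
    environ = {
        'REQUEST_METHOD': 'GET',
        'PATH_INFO': path,
        'SERVER_NAME': 'localhost',
        'SERVER_PORT': '80',
        'SERVER_PROTOCOL': 'HTTP/1.0',
        'wsgi.version': (1, 0),
        'wsgi.url_scheme': 'http',
        'wsgi.input': StringIO(''),
        'wsgi.errors': sys.stderr,
    }
    result = {}
    def start_response(status, headers):
        result['status'] = status
        result['headers'] = headers
    body = ''.join(app(environ, start_response))
    return result['status'], result['headers'], body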

I ran into some minor but nonetheless annoying problems along the way. For example, Zope (which is needed for parts of zope.testbrowser) cannot be installed with "easy_install zope"; the PyPi page doesn't link to a download page, and even when downloaded, Zope names its setup script 'install.py' rather than 'setup.py'. This confuses easy_install.

I also couldn't figure out contact info for either Zope or CherryPy (the owners of webtest). I didn't look terribly hard, but the PyPi contact e-mail for Zope is a subscribers-only list, and <team at cherryp.org> (which is the e-mail address at the top of webtest.py) doesn't exist. CherryPy folks -- someone, please contact me if you want the patches to webtest (or just grab them from the wsgi_intercept code yourself, of course!).

And easy_install seems to be confused by packages with '.' in their names; zope.testbrowser doesn't install with just an 'easy_install zope.testbrowser'. (Spoiled, aren't I, to expect it all to work so easily!)

But these are only minor gripes. On the whole, the packages I downloaded and modified had nice, clean source code. I think there's something about people who write testing tools that leads them to clean up their code ;).

A few quick off-the-cuff opinions, while I'm at it:

  • zope.testbrowser is indeed a simple, clean interface to mechanize. (Python 2.4 only, however.)

  • I like the way mechanoid (a fork of mechanize) has broken out the class structure of mechanize into files.

  • If I needed a minimal Web testing tool, I'd use webtest.

  • funkload (based on webunit) looks pretty neat. (In the Python world, it's probably the major competitor to twill; they're focusing a bit more on load-testing, though.)

  • people should use ClientCookie rather than urllib2, I think. It does more, and it's written by the same person ;).

  • One of mechanize's big problems is its retention of backwards compatibility. John seems intent on keeping Python 2.2 and up working in mechanize; I think that complicates the code more than it should.

Anyhoo, g'nite.

--titus

twill 0.8

"85% unit tested". ;)

PyPi entry, announcement, ChangeLog.

--titus

Tidy

I spent an hour or two on Sunday adding a tidy preprocessor into twill.

There are a lot of tidy Python implementations out there: ElementTree tidylib, ElementTree TidyTools, pyTidy, mxTidy, and utidylib. Some of them (elementtree) are part of other packages or require stuff that I don't want to bundle or require (utidylib requires ctypes); most of them require the tidylib binary and then interface with it. Because I want the twill distro to be cross-platform, I decided to go with the approach taken in ElementTree TidyTools, which relies only on the command-line binary. Inspection of the code revealed that it simply executed os.system, without much in the way of error trapping, so I ended up rolling my own (search for 'run_tidy'). Whee.
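
For the curious, the shape of such a wrapper is roughly this (a sketch, not twill's actual run_tidy; it assumes the 'tidy' binary is on the PATH and uses subprocess, so Python 2.4+):

# run HTML through the command-line tidy binary, capturing warnings
import subprocess

def run_tidy(html):
    p = subprocess.Popen(['tidy', '-q', '-asxhtml'],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    cleaned, warnings = p.communicate(html)
    return cleaned, warnings          # the caller decides how picky to be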

So, in the next release of twill, it will automatically preprocess stuff with tidy unless you turn it off; you can also assert that pages have no 'tidy' warnings.

Eggz Rock

The (imminent) next release of twill, twill 0.8, will include support for Python Eggs.

When I started, I was worried about a few technical issues: for example, I include pyparsing and mechanize/ClientForm/ClientCookie/pullparser within the twill distribution, and then munge sys.path to load them first. How would this work with eggs? No problem; the same path-munging code works whether I'm loading from a directory or a zip file. (I just use os.path.join.)
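
The path munging itself is just a couple of lines; something along these lines (the 'other_packages' directory name is for illustration):

# push the bundled copies onto the front of sys.path; os.path.join keeps
# this working whether __file__ lives in a directory or inside a zipped egg
import sys, os.path

_here = os.path.dirname(__file__)
sys.path.insert(0, os.path.join(_here, 'other_packages'))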

Version numbering: would upgrading etc. work nicely? Yep. The pkg_resources version handling is so smart, it's not even inspired. (...by which I mean that it's brilliantly simple.)

As a bonus, it will be even easier to distribute "development" versions of twill. I can just build an egg with an alpha version number, e.g. '0.8.1a1' or '0.8.1a2', link to 'em on a page, and then point people to that page. easy_install will do the rest. In fact, I don't even need to build the page manually: I can just tell Apache to make my development dist/ directory available to the public via "Options +Indexes".

For example, typing

easy_install -f http://issola.caltech.edu/~t/twill-dist/ twill

will automatically scan for the latest version and install it. Nifty.

So far, my main gripe? 'ez_setup' is an ugly name, and it's an ugly file to have sitting around in my main development directory. (You may recall that I dislike cluttering up the main directory.) So call me picky ;).

--titus

Today is a day for... Miscellany!

ORMs

Re my long post on object-relational mappers:

Jonathan Ellis points me towards a fairly negative post on PostgreSQL table inheritance, which cucumber2 uses. The thread basically states that no one is maintaining table inheritance and that only inertia is keeping it in the code. My impression was somewhat the opposite: I've seen statements that table inheritance will not be taken out, because there are people using it. *shrug* It's a neat feature, IMO.

Jonathan also points me towards PyDO2, which seems to have good documentation and a philosophy that supports working on the database with other tools. I've seen PyDO before but never had a chance to play with it seriously. I like the look of the code, though, on a cursory inspection.

Runar's Blog (written by Runar?) has a long post on relational model vs Python. Haven't finished digesting it yet. One particularly interesting link (broken in that article) is to SQLAlchemy.

An Open-Source Story: Producing Error-Free Software is Hard

Via RISKS, this story on an optimization bug in gcc (or so I infer) that affected X, and perhaps many other pieces of code. Whoo.

Python Docs

Stephen Ferg e-mailed me about my earlier post on Python docs. He pointed me towards a long, fascinating thread on Python doc updates.

I'll go into this more later, but it's worth mentioning that anecdotal evidence from genome annotation suggests that the PHP model (of allowing at least somewhat uncontrolled posting of information to docs) elicits far more contributions than rigorous up-front quality control. The reason? Experts won't go out of their way to add information on something they understand well, but they will put in the time to correct something that's just plain wrong. So you've just got to put in mechanisms to facilitate this kind of interaction.

Using arch/darcs from Windows

In a response to my open source project truisms page, Moof points out that darcs and arch don't work very well for Windows. I'm sure he's right: I tend to forget about that platform; when I do have to develop for it, I try to use cygwin. So to get Windows developers you've got to use something like svn or CVS. And, as he points out, there are a lot more Windows developers out there than developers for any other platform... so you want to get Windows developers.

Are there any TortoiseSVN-style clients out there for darcs?

Moof also echoes Marius Gedminas's point that Trac is something worth keeping an eye on. Trac is dangerously close to becoming "SourceForge in a box", which would be a good answer to most of my suggestions on how to run an OSS project.

In other news, it may be time to go get a blog that allows comments ;).

--titus

25 Nov 2005 (updated 26 Nov 2005 at 03:48 UTC) »
For the sloooooow Thanksgiving holiday... Happy TG to everyone in the US!

Object-relational stuff, revisited

After I dissed on Jeff Watkin's ORM assumptions & logic, Sean Jensen-Gray staged an intervention & basically told me I was acting like a git. He's undoubtedly right, and I'm continuing that part of the conversation off-line where it belongs.

However, in the name of separating the smoke from the fire, here's some more discussion about ORMs.

First of all, here's my ORM, cucumber2, just so y'all know where I'm coming from. I make no claims about generality or quality or goodness, except to say that I like it & have been using its predecessor for over 4 yrs now. Works great. cucumber2 is some of the nicer bits of concept code I've ever written; it's definitely on my refrigerator. (YMMV...)

Based on my relatively minimal experience with ORMs, then, here are some of my own beliefs about ORM writing in Python.

  • Use "magic".

    Properties, metaclasses, introspection, dynamic code generation, and "under-cover twiddling" can all help make a clean piece of code. Not using them can hurt by making your code over-verbose and cluttering your APIs with information not relevant to the task at hand.

    Document your use, test your use, sure -- but use them.

  • Object-relational impedance mismatch is a big issue.

    Do I need to say more? Just think: how do you encode a collection in a database? (Make sure you're maintaining referential integrity in your answer...) How do you encode an inheritance hierarchy? These are simple examples of a serious mismatch between the relational model and the object model. This is the problem that new ORMs should try to solve, IMO.

  • Don't start out to write a database-generic ORM.

    There's lots of discussion about using database-specific features in the SQL world (although my google-fu is failing me...), so I won't rehash that. I come down solidly on the side of committing yourself to a specific database. I think it's particularly important in the case of an ORM, which may use *very* database-specific stuff to work its magic (e.g. cucumber2 and the PostgreSQL ORDBMS features). Porting this magic between databases is likely to get very hairy & involve lots of additional complexity.

    The attempt to make your ORM generic to multiple databases may well be a specific case of premature optimization (below); it seems like over-reaching oneself by attempting to encompass database-generic issues prior to settling on a good, clean API.

  • Make sure you can still use straight SQL.

    Do you have specialized metainfo that will break SQL queries/inserts/etc. that don't know about this information? If so, this seriously reduces the utility of your databases: you can't use external tools any more, without adding in ORM-specific awareness.

    Even if you can hack this in with triggers and VIEWs, you're adding a whole 'nother layer of complexity. Bad.

  • Premature optimization is the root of all evil. (Hoare via Knuth)

    (Ironically, the first few google hits seem to be dedicated to discussing when this rule doesn't apply...)

    This covers things like caching and cache invalidation code, which in my experience is difficult to handle generically (although possible, esp. if you only allow transaction-wrapped access). Also, SQL query optimization is tough to do in SQL, much less in a layer wrapped around SQL. In many cases, you should consider optimizing by writing app/data-model-specific SELECT statements that integrate with your ORM interface.

Most of all, think of your ORM like an object database. Layering a procedural interface on top of an SQL database isn't building an ORM -- it's building a library that talks to an SQL database. Useful, but probably not new. If you solve a hard problem -- even poorly -- that's new.

For example, one of my absolute requirements: can you determine the class of an "object" (row, tuple, whatever) in the database without using metadata that's stored external to the database (like, say, in your Python object)? I think that's a pretty ORMy requirement, myself, and it helps to not violate condition #4 (straight SQL) above. Another requirement: can you store object hierarchies straightforwardly? Again, seems ORMy to me, but it speaks to the impedance mismatch problem -- it's a tough requirement.
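
For the first of those requirements, PostgreSQL's table inheritance makes one answer possible: every row knows its concrete table via the system column tableoid, so the "class" comes back from the database itself. A sketch -- tableoid::regclass is a real PostgreSQL feature, but the tables, classes, and cursor usage here are invented for illustration and are not cucumber2's actual code:

# recovering a row's class from the database, with no external metadata
class Person(object):
    pass

class Employee(Person):               # maps to a table that INHERITS (person)
    pass

TABLE_TO_CLASS = {'person': Person, 'employee': Employee}

def load_person(cursor, person_id):
    cursor.execute("SELECT p.tableoid::regclass, p.name "
                   "FROM person p WHERE p.id = %s", (person_id,))
    table_name, name = cursor.fetchone()
    obj = TABLE_TO_CLASS[str(table_name)]()   # class chosen by the DB row itself
    obj.id, obj.name = person_id, name
    return obj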

Looking over this list, I think these are all pretty tough requirements. You would be justified in asking "well, why not just use an object database, then?"

There are a few obvious reasons.

  • Requirements. Maybe you have to (or really really prefer to) use an SQL database. Your support staff only understands SQL; your SQL backups are automated; you really like SELECT queries and the command-line interface; or your boss tells you you have to.

  • Language neutrality. Say what you will, but SQL databases are admirably language neutral... suppose you have to access the database from multiple languages, like Java, Python, Ruby, and Perl. Most object databases are language-specific (for obvious reasons...) so you're stuck with a relational DBMS.

  • Maturity. I personally dislike this argument, but: SQL databases like Oracle, PostgreSQL, MySQL, etc. have a long history and the flaws are well known. Not so with ODBMSs.

  • Teamwork. You work with people that only grok SQL. I am sympathetic to this argument, coming from an academic environment with moderately high turnover and people who have relatively little software engineering background.

  • Query performance. If a lot of your data is fundamentally organized in relational ways, I bet your SELECT statements can be heavily optimized in ways that no object database can match.

  • Support. Lots of companies support SQL databases. Not so many support object databases.

OK, so what use is an ORM? I'm assuming anybody who's made it this far is already sold on ORMs, but just in case, here are a couple of my reasons:

  • Impedance mismatch. Object-oriented languages organize data differently than the normal SQL data-model. You really want to be able to take advantage of both. (Or at least I do.)

  • Programming reliability and security. There are a number of mistakes -- some obvious, some not so much -- that can be made by SQL programmers. Hell, you're generating SQL code in another language -- how can this not be problematic? (It's largely solved by using appropriate libraries for SQL access, mind you.)

  • Joins. I don't know about you, but I'm not smart enough to understand LEFT OUTER JOINs. (Could someone else please write a library to do it for me, intelligently?)

You would now be even more justified in calling me somewhat nuts. I have strict requirements for an ORM that are nigh impossible to meet, and lots of reasons why you might be stuck with an SQL database. Yet I've also given a few good reasons to use an ORM. What to do?

My first point: it's not an easy problem. That's why seriously smart people -- much smarter than me -- have thought deeply on the matter and come up with very little.

My second point: it's worth tackling. 'nuff said, here; I think the benefits are obvious.

My third point: I personally guess that there are solutions to most of the problems that I lay out for ORMs, and these solutions lie in the dynamic nature of languages like Python (and probably languages like Ruby and Perl). Certainly I can easily do interesting things in Python that are tremendously difficult to do in Java, although many of these things use the "black magic" of metaclasses.

OK, there's no real conclusion here.

I'm at least minimally satisfied with the approach I've taken in cucumber2. Again, YMMV. Apart from polishing and optimizing the code a bit, I'm thinking about taking a pyparsing-style approach to SELECTs. More on that next time I get the yen to hack on something other than twill ;).

I hope you're at least mildly entertained by my wild-eyed ORM discussion, and I look forward to the horde of disapproving comments. (Luckily, I've disabled comments on this blog, so I won't have to make them public if I don't want to. [0])

cheers,
--titus

p.s. apologies for the weird formatting... advogato *shrug*

[0] I feel compelled to point out that this sentence is a joke.

