6 Apr 2006 nconway   » (Master)

I'd like to highlight Tom Lane's recent commit that fixes a major defect in the implementation of domains in PostgreSQL. Domains can have CHECK or NOT NULL constraints, but prior to Tom's patch, these constraints were not enforced inside PL/pgSQL, or on the return value of a procedural language function. (That meant you could have values of a domain type that violated the domain's own constraints!) This has been a known problem for quite some time, but it was somewhat tricky to fix.

In Postgres, each type has an associated "input function" that takes a string and produces a value of the type. Prior to Tom's commit, a domain type's input function was just the input function of its base type. As a result, just invoking a type's input function (which is done in several places throughout the backend) wasn't enough to check the constraints on a value of a domain type: you also had to explicitly look up any associated domain constraints and check them. So in all the places where we were invoking input functions, we'd need to add some additional code to explicitly check domain constraints. Needless to say, that would be pretty ugly. It's just a few additional function calls, but it's really not something we ought to be doing at every call site of a type's input function.
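To make the problem concrete, here is a toy Python model of the old arrangement. It is not PostgreSQL code: `TypeEntry`, `int4_in`, `posint`, and `check_domain` are hypothetical names invented for this sketch, and a CHECK constraint is modelled as a plain Python predicate.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TypeEntry:
    """Toy stand-in for a type's catalog entry (hypothetical, not a real API)."""
    name: str
    input_fn: Callable[[str], int]  # the "input function": string -> value
    checks: List[Callable[[int], bool]] = field(default_factory=list)


def int4_in(text: str) -> int:
    """Input function of the base type: parse a string into an integer."""
    return int(text)


# Old design: the domain simply reuses its base type's input function.
posint = TypeEntry("posint", input_fn=int4_in, checks=[lambda v: v > 0])


def check_domain(entry: TypeEntry, value: int) -> int:
    """The extra step that every call site had to remember to perform."""
    for check in entry.checks:
        if not check(value):
            raise ValueError(
                f"value {value} violates a constraint of domain {entry.name}"
            )
    return value


# A call site that only invokes the input function lets a bad value through.
unchecked = posint.input_fn("-5")

# A call site that remembers the explicit check rejects the same input.
try:
    check_domain(posint, posint.input_fn("-5"))
    rejected = False
except ValueError:
    rejected = True
```

In this model `unchecked` ends up holding a value of the domain that breaks the domain's own constraint, which is the kind of hole described above; only call sites that add the `check_domain` step are safe.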

Worse still, looking up the constraints associated with a domain is relatively expensive (it requires a non-syscache'd catalog lookup). To actually evaluate a CHECK constraint you need to evaluate an expression, which requires instantiating a bunch of executor machinery, which is also not that cheap. So in all the places where we'd need to add checks of domain constraints, we'd also need to think about how to efficiently load and cache the domain constraints and executor machinery, and when to invalidate/release them.

I did some work to add domain constraint checking for function return values in PL/pgSQL. I never applied the patch, partly because there were some implementation details that were tricky to resolve, but mostly because it just seemed like the wrong approach.

Tom's fix is much cleaner: by providing a separate input function for all domain types and doing the constraint checking there, we're guaranteed to check domain constraints at the appropriate time, without the need to clutter each call-site of a type input function ([1], [2]). It would be worthwhile to investigate whether this results in a performance regression, though: there's no easy way to cache the executor machinery needed to evaluate a CHECK constraint in this design, whereas the prior design allowed each call-site to implement its own EState caching.
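The shape of this approach can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual backend code: `make_domain_in`, `int4_in`, `posint_in`, and `call_site` are hypothetical names, CHECK constraints are plain predicates, and `None` stands in for an SQL NULL.

```python
from typing import Callable, List, Optional

Check = Callable[[int], bool]


def int4_in(text: str) -> int:
    """Input function of the base type: parse a string into an integer."""
    return int(text)


def make_domain_in(
    name: str,
    base_in: Callable[[str], int],
    checks: List[Check],
    not_null: bool = False,
) -> Callable[[Optional[str]], Optional[int]]:
    """Build a dedicated input function for one domain (hypothetical helper)."""

    def domain_in(text: Optional[str]) -> Optional[int]:
        if text is None:
            if not_null:
                raise ValueError(f"domain {name} does not allow null values")
            # Simplification: this toy skips the CHECK predicates for NULL.
            return None
        value = base_in(text)  # delegate parsing to the base type
        for check in checks:
            if not check(value):
                raise ValueError(
                    f"value {value} violates a constraint of domain {name}"
                )
        return value

    return domain_in


posint_in = make_domain_in("posint", int4_in, [lambda v: v > 0], not_null=True)


def call_site(input_fn: Callable[[Optional[str]], Optional[int]],
              text: Optional[str]) -> Optional[int]:
    """Any caller: it invokes the type's input function and nothing more."""
    return input_fn(text)
```

Here `call_site(posint_in, "7")` returns 7, while `"-5"` or `None` raises an error, even though `call_site` knows nothing about domains. The sketch deliberately ignores the cost of loading the constraints and setting up expression evaluation, which is exactly the open performance question.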

Nice work, Tom!
