Older blog entries for nconway (starting at number 23)

3 Mar 2006 (updated 3 Mar 2006 at 15:31 UTC) »

A brief followup to my earlier post on random generation of SQL queries for testing purposes: Don Slutz from MSR has an interesting paper called "Massive Stochastic Testing of SQL" that describes a stochastic SQL query generator called RAGS. One of the problems with randomized testing that I noted earlier is that it is difficult to distinguish between successes and failures: if the backend crashes while executing a query it is very likely a defect of some kind, but if no crash occurs, how do we know that the backend has produced the correct result set? Slutz writes:

If a SQL Select executes without errors, there is no easy method to validate the returned values by observing only the values, the query, and the database state. Our approach is to simply execute the same query on multiple vendors' DBMSs and then compare the results.

As the author notes, this isn't perfect:

The use of database systems from different vendors for output validation proved to be extremely useful for the SQL common to many vendors. The down side is that the common SQL subset is relatively small and changes with each release. We also found the differences in NULL and character string handling and numeric type coercion in expressions to be particularly problematic (these are also portability issues).
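
Still, the comparison step itself is easy to picture. Here is a rough sketch of what it might look like over libpq -- my illustration, not RAGS code -- which pretends for simplicity that both servers can be reached through libpq, and ignores the row-ordering, NULL, and formatting problems described in the quote above:

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

/*
 * Run the same query on two servers and report whether the result sets
 * match cell-for-cell. A real harness would also need to normalize row
 * ordering, NULL display, and numeric formatting -- which is exactly the
 * hard part Slutz describes.
 */
static int
results_match(PGconn *a, PGconn *b, const char *query)
{
    PGresult   *ra = PQexec(a, query);
    PGresult   *rb = PQexec(b, query);
    int         match = 1;
    int         i, j;

    if (PQresultStatus(ra) != PGRES_TUPLES_OK ||
        PQresultStatus(rb) != PGRES_TUPLES_OK)
        match = 0;
    else if (PQntuples(ra) != PQntuples(rb) ||
             PQnfields(ra) != PQnfields(rb))
        match = 0;
    else
    {
        for (i = 0; match && i < PQntuples(ra); i++)
            for (j = 0; match && j < PQnfields(ra); j++)
                if (strcmp(PQgetvalue(ra, i, j), PQgetvalue(rb, i, j)) != 0)
                    match = 0;
    }

    PQclear(ra);
    PQclear(rb);
    return match;
}

int
main(void)
{
    /* Placeholder connection strings: point these at two different servers. */
    PGconn     *a = PQconnectdb("dbname=test host=server-a");
    PGconn     *b = PQconnectdb("dbname=test host=server-b");

    if (PQstatus(a) != CONNECTION_OK || PQstatus(b) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed\n");
        return 1;
    }

    printf("%s\n", results_match(a, b, "SELECT 1") ? "match" : "MISMATCH");
    PQfinish(a);
    PQfinish(b);
    return 0;
}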

The paper's approach is to randomly generate SQL queries using an SQL grammar. The grammar defines the set of legal SQL statements, so it can be used to generate members from that set. For example, a trivial SELECT grammar might be:

    SelectStmt: SELECT t_list FROM relation

The SQL generator fills in the "t_list" and "relation" non-terminals randomly: a randomly-sized list of random target list expressions for "t_list", and a random relation name for "relation". The paper's SQL generator allows for a few simple configuration settings: generate between n and m joins, generate between x and y elements in the target list, and so on. William McKeeman's "Differential Testing for Software" explores some similar ideas for stochastic testing of C compilers. Both papers have been discussed by other bloggers ([1], [2]) in the past.
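
To give a flavor of how grammar-driven generation works, here is a toy sketch for the trivial grammar above -- just an illustration with made-up table and column names, nothing like the real RAGS configuration machinery:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Made-up schema to draw identifiers from. */
static const char *relations[] = {"customers", "orders", "line_items"};
static const char *columns[] = {"id", "name", "price", "created_at"};

#define NELEMS(x) ((int) (sizeof(x) / sizeof((x)[0])))

/* Expand the "relation" non-terminal: pick a random table name. */
static const char *
gen_relation(void)
{
    return relations[rand() % NELEMS(relations)];
}

/* Expand the "t_list" non-terminal: a randomly-sized list of random columns. */
static void
gen_target_list(void)
{
    int         n = 1 + rand() % 4;
    int         i;

    for (i = 0; i < n; i++)
        printf("%s%s", i > 0 ? ", " : "", columns[rand() % NELEMS(columns)]);
}

/* Expand the start symbol: SelectStmt: SELECT t_list FROM relation */
static void
gen_select_stmt(void)
{
    printf("SELECT ");
    gen_target_list();
    printf(" FROM %s;\n", gen_relation());
}

int
main(void)
{
    int         i;

    srand((unsigned) time(NULL));
    for (i = 0; i < 10; i++)
        gen_select_stmt();
    return 0;
}

The real generator obviously goes much further (joins, predicates, nested queries, configurable limits), but the basic structure is the same: each grammar production becomes a routine that makes random choices.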

One problem with this approach is that it often generates unrealistic queries. Using an SQL grammar ensures the randomly-generated queries will be well-formed, but it doesn't provide any guarantees about whether the queries will be sensible, or bear any relation to the sorts of queries users are actually running. The relatively coarse-grained tuning knobs provided by RAGS also don't allow test suites to be written to validate the behavior of the database for specific classes of queries: you can't generate a specialized set of SQL queries to test a particular part of the query optimizer, for example. It would be nice to be able to take the query workload from a production database, and write a high-level constraint that captures the notion of "queries that look like that." I'm still hopeful a DSL could provide a nice solution to both problems, but I haven't had a chance to really think about it yet.

Mark Shuttleworth seems fairly pleased with Postgres. Money quote:

Postgres is a truly awesome database. When we started working on Launchpad I wasn't sure if it would be up to the job. I was so wrong. It's been robust, fast, and *professional* in every regard.

6 Feb 2006 (updated 6 Feb 2006 at 07:14 UTC) »

Brain dump: I've heard of QuickCheck before, but I only found time to look at it in some depth earlier today. It's a very simple but powerful tool: rather than writing unit test cases by hand, the tester just specifies a set of constraints that the function being tested must satisfy. QuickCheck then randomly generates test cases and checks that the constraints hold. A trivial example of a constraint, taken from the 2000 QuickCheck paper, is for an insertion function on ordered lists:


prop_Insert :: Int -> [Int] -> Property
prop_Insert x xs =
    ordered xs ==> ordered (insert x xs)


That is: given any integer x and any ordered list of integers xs, insert x xs should produce an ordered list. QuickCheck takes care of generating the necessary random integers automatically. The programmer can view the distribution of random values generated by QuickCheck, or provide their own data generator.

I think it would be interesting to apply this to PostgreSQL. For example, I find it frustrating to try to manually test the impact of changes to the query optimizer: it's difficult to determine whether a local change to the optimizer is correct for all possible query trees. Worse yet, a planner change might optimize some queries and pessimize others — unless the developer realizes this while implementing the change, we often don't hear about these sorts of performance regressions until the beta period or even later.

It might be interesting to try to define a domain-specific language that would allow you to construct random SQL queries that satisfy certain high-level constraints, and then verify that the backend executes those queries correctly. For example, a statement in the DSL might be "for all SQL queries q such that q contains 0 < n < 10 joins, check that q is executed without error." Checking that queries can be executed without error is fairly trivial; perhaps we could check more advanced properties, like "executes within a certain amount of time" or "produces a result set that satisfies the constraint: [...]".

That said, there is plenty of low-hanging fruit when it comes to improving the state of testing for Postgres. For example, the ability to generate arbitrary n-relation joins of certain kinds (e.g. star schemas) would be useful. Another simple improvement would be to convert the TPC-D query workload into something that could be used to isolate changes in the optimizer's performance for that set of (important) queries.

Via Zack, there's an interesting entry on Xooglers from an engineer on the Google AdWords team, about the experience of using MySQL initially and then switching to an unnamed commercial database. (Of course, there is the usual uninformed proselytizing in the comments: "you should have used x, idiots!")

Two impressions from the story:

  • MySQL works just fine for a lot of applications. While admittedly I often speak with pro-Postgres people, there's a segment of the user base that has come to regard using MySQL in almost any situation as something to be embarrassed about. I've never understood that attitude (or why SQLite is not regarded in the same way): there are many applications for which MySQL is a good choice or even the best choice. AdWords may well be one of them—I won't claim to know.
  • Technical decisions made at the insistence of management often end badly:
    After AdWords launched, Jane, the ads group manager, decided that now would be a good time to switch over to a "real" database. "Real" is one of those words that Doug ought to add to his list of words. It means "expensive". Many managers seem to have this idea that it is invariably true that you get what you pay for, and that therefore nothing that is available for free can possibly be any good....

    To make a long story short, it was an unmitigated disaster.

I noticed this interesting story about FeedLounge moving from MySQL to Postgres.

I recently stumbled upon an online copy of Claude Shannon's groundbreaking 1950 paper, "Programming a Computer for Playing Chess". It's definitely worth a read -- it's amazing to think that Shannon had already tackled this sort of problem as early as 1950.

16 Oct 2005 (updated 16 Oct 2005 at 02:23 UTC) »

Java

Recently I've had to use Java for some assignments at school. It's not as unpleasant as I recalled (generics in Java 5.0 help a lot), but I continue to be annoyed by Java's checked exceptions. I realized why: they tend to disrupt the way that I write code. I'll often write the code for some method to sketch out how to implement something, without necessarily wanting to get all the details right the first time through. Clean error handling can be tricky to get right; before deciding how to handle errors, I often like to get some experience using the class elsewhere in the project. Java makes this difficult: I either need to litter my code with empty try ... catch blocks, or else clutter up my method signatures with ugly exception specifications. I like the fact that Java forces the programmer to handle error conditions, but it gets the timing all wrong: I'm eventually going to handle errors, but Java forces me to worry about it much earlier than I would prefer to.

LISTEN/NOTIFY

For 8.2, I've decided that I definitely want to fix the MVCC bloat problems with LISTEN/NOTIFY: each NOTIFY updates a row in pg_listener, so applications that use notifications heavily need to VACUUM the system catalogs frequently to clean up the resulting dead rows; worse, pg_listener is always sequentially scanned, so performance degrades linearly. This is particularly important since Slony-I uses notifications, so a lot of people will be using them whether they know it or not.
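
For reference, the client side of the feature is simple enough. A typical libpq consumer looks roughly like this sketch (placeholder connection string and channel name; a real application would block on the connection's socket rather than spin):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=test");      /* placeholder */
    PGnotify   *note;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    /* "my_event" is a hypothetical notification name. */
    PQclear(PQexec(conn, "LISTEN my_event"));

    for (;;)
    {
        /* Better: select() on PQsocket(conn) and wake only when data arrives. */
        PQconsumeInput(conn);
        while ((note = PQnotifies(conn)) != NULL)
        {
            printf("notification \"%s\" from backend pid %d\n",
                   note->relname, note->be_pid);
            PQfreemem(note);
        }
    }

    /* not reached */
    PQfinish(conn);
    return 0;
}

Every NOTIFY behind those notifications is a pg_listener update on the server, which is where the bloat described above comes from.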

Fixing the problem basically requires rewriting the way that notifications are stored; the idea is to keep them in shared memory rather than in the system catalogs. The difficulty is that there is only a fixed amount of shared memory in which to store a potentially unbounded number of notifications, so it's not an easy problem to solve. I sent a few ideas about how to fix this to the hackers list, but Tom pointed out a few problems, so it's back to the drawing board. Alvaro and I have been discussing how to use the SLRU machinery to store notifications in shared memory with a disk backing store, but I haven't found a design I'm happy with. Hopefully more on this in a few days...

30 Sep 2005 (updated 30 Sep 2005 at 10:27 UTC) »

VLDB

I was browsing the program from this year's VLDB and noticed a few interesting papers. "Getting Priorities Straight: Improving Linux Support for Database I/O" (PDF) is worth a read -- the authors are apparently collaborating with MySQL AB to improve MySQL's performance under Linux, including modifying MySQL to use AIO. I also liked two papers (one from Microsoft and one from IBM) on XML integration and XQuery optimization issues.

20 Jul 2005 (updated 20 Jul 2005 at 02:23 UTC) »

Robert: Most of the issues the Coverity tool found were actually in pg_dump and ecpg; it found only a single bug in the Postgres backend, which I'm somewhat embarrassed to admit I introduced during the 8.0 development cycle. That is both impressive and suspicious -- a single bug in 250,000-odd lines of C is too good to be believed. As it turns out, there is a good reason for this: the Stanford checker was already run against the Postgres tree a few years ago (in the 7.3 timeframe, IIRC), and the bugs it identified were fixed. Since the Coverity tool is an improved, commercialized version of the original Stanford checker, it's not too surprising that it didn't find very many new bugs in the backend.

Another factor is that the way that memory management is done in Postgres makes static analysis a little more difficult. For example, given the code:

#include <stdio.h>
#include <string.h>

void foo(char *arg)
{
    char *ptr = strdup(arg);
    if (!ptr)
        return;

    printf("Hello, world: %s\n", ptr);
}

it is relatively easy to determine via static analysis that the strdup leaks memory when ptr goes out of scope. However, malloc and friends are rarely used directly in the backend; we use region-based memory allocation instead. In other words, each allocation is done in a "memory context"; an allocation can be manually released via pfree, or all the allocations in a context can be released at once (via MemoryContextReset and related functions). Contexts are arranged into a hierarchy -- when a context is reset, all of its child contexts are also reset. This significantly simplifies memory management and error recovery: when we're done processing a query, we can just reset the memory context associated with that query, and we can be pretty confident all associated memory will be released. If a transaction is aborted, we can longjmp back to the abort error handler and then reset the transaction's context -- this cleans up all memory allocated during the transaction.
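
Concretely, the idiom looks something like this -- a simplified sketch rather than code from the tree, using AllocSetContextCreate and the default allocation-size macros:

#include "postgres.h"
#include "utils/memutils.h"

static void
context_demo(void)
{
    MemoryContext query_cxt;
    MemoryContext child_cxt;
    MemoryContext oldcxt;

    /* Create a context as a child of whatever context is current... */
    query_cxt = AllocSetContextCreate(CurrentMemoryContext,
                                      "DemoQueryContext",
                                      ALLOCSET_DEFAULT_MINSIZE,
                                      ALLOCSET_DEFAULT_INITSIZE,
                                      ALLOCSET_DEFAULT_MAXSIZE);
    /* ...and a child of that, to show the hierarchy. */
    child_cxt = AllocSetContextCreate(query_cxt,
                                      "DemoChildContext",
                                      ALLOCSET_DEFAULT_MINSIZE,
                                      ALLOCSET_DEFAULT_INITSIZE,
                                      ALLOCSET_DEFAULT_MAXSIZE);

    oldcxt = MemoryContextSwitchTo(child_cxt);
    (void) pstrdup("allocated in the child context");
    MemoryContextSwitchTo(query_cxt);
    (void) palloc(128);         /* allocated in the query context */
    MemoryContextSwitchTo(oldcxt);

    /*
     * Resetting the parent releases both allocations (the child context is
     * reset along with it); no individual pfree calls are needed.
     */
    MemoryContextReset(query_cxt);

    /* Deleting the context destroys it, and its children, entirely. */
    MemoryContextDelete(query_cxt);
}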

On the other hand, this technique makes static analysis of memory leaks a little more difficult:

#include "postgres.h"           /* declares pstrdup() and friends */

void foo(char *arg)
{
    char *ptr = pstrdup(arg);
    printf("Hello, world: %s\n", ptr);
}

pstrdup allocates memory in the CurrentMemoryContext; when that context is reset or deleted, the memory will be released. Some variant of the above code could be perfectly valid: if the code knows that CurrentMemoryContext will always be a short-lived memory context (i.e. one that will be reset "soon"), it needn't bother explicitly freeing all its allocations. This improves readability (by not cluttering the code with pointless frees) as well as performance (it is faster to reset an entire context at once than to release each individual allocation).

So in some sense, memory leaks are impossible: allocated memory will never be unrecoverably lost until process exit. However, you can still have problems -- this bug in 8.0 is a typical example of the kinds of memory leaks you can get with region-based allocation. This code is rewriting a table, doing essentially:

for each tuple in the old table
    evaluate any new default expressions
    recheck any applicable constraints
    insert the tuple in the new table

We need to do this when adding a column with a non-NULL default value to a table, for example. The problem was that when evaluating default expressions, we invoked various functions that performed allocations in the CurrentMemoryContext. These functions effectively assumed that CurrentMemoryContext was short-lived -- but in this case, we didn't reset the context for each iteration of the loop. Again, there is no "memory leak" here, strictly speaking -- the CurrentMemoryContext would eventually be reset. But if the loop iterated a few hundred million times on a large table, we would quickly exhaust all available memory.
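
The usual fix is to give a loop like this its own context and reset it once per iteration, so that whatever the per-tuple work allocates is released before the next tuple is processed. Schematically -- a sketch with made-up helper functions, not the actual ALTER TABLE code:

#include "postgres.h"
#include "utils/memutils.h"

/* Hypothetical stand-ins for the real per-tuple rewrite steps. */
extern void *read_next_old_tuple(void);
extern void evaluate_defaults(void *tuple);
extern void recheck_constraints(void *tuple);
extern void insert_into_new_table(void *tuple);

static void
rewrite_table(void)
{
    MemoryContext per_tuple_cxt;
    void       *tuple;

    per_tuple_cxt = AllocSetContextCreate(CurrentMemoryContext,
                                          "RewritePerTupleContext",
                                          ALLOCSET_DEFAULT_MINSIZE,
                                          ALLOCSET_DEFAULT_INITSIZE,
                                          ALLOCSET_DEFAULT_MAXSIZE);

    while ((tuple = read_next_old_tuple()) != NULL)
    {
        MemoryContext oldcxt = MemoryContextSwitchTo(per_tuple_cxt);

        /* These can allocate freely in CurrentMemoryContext... */
        evaluate_defaults(tuple);
        recheck_constraints(tuple);
        insert_into_new_table(tuple);

        MemoryContextSwitchTo(oldcxt);
        /* ...because everything they allocated is released right here. */
        MemoryContextReset(per_tuple_cxt);
    }

    MemoryContextDelete(per_tuple_cxt);
}

With this structure, peak memory use is bounded by the largest single iteration rather than by the number of tuples in the table.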

So how do you detect these sorts of pseudo-leaks via static analysis? I don't really see an effective way, although it's an interesting problem.

Anyway, my point is that using metrics like "tool X found Y bugs per Z lines of code" as an indication of software quality is pretty dubious -- and it's even more dubious to try to use this metric to compare the quality of two different software products.

27 Jun 2005 (updated 27 Jun 2005 at 08:31 UTC) »

Postgres: We're making steady progress toward the 8.1 feature freeze. Perhaps I'll jinx it by writing this, but it seems that people are paying more attention to the feature freeze date (July 1st) this time around. There is the usual flood of large patches as the freeze date approaches, but it seems a bit less hectic than it has been in years past.

Slony: I don't have anything I need to get in by the freeze, although there are a few patches I need to review. I'm spending most of my time these days working on Slony II (the synchronous multi-master database replication system for Postgres that I'm working on with a few other folks). We haven't been talking much about Slony II -- not because there isn't work being done on it, but because I think there's a consensus that we don't want to hype something that hasn't even been completely designed yet, let alone implemented. Still, I'm hoping to get a prototype finished by the end of July that can do multi-master replication of arbitrary DML at the serializable isolation level. Hopefully the prototype will also include some basic recovery functionality ("recovery" is how we synchronize the state of a joining node with the current state of the cluster). Once the system gets a little closer to reality, I'll post some more about its design.

Robert: Yeah, the Postgres-R (SI) work is one of the papers we have been considering for Slony II (there's a copy you can get here that doesn't require IEEE membership). The current prototype we're working on will be fairly similar to what Prof. Kemme proposes, although I don't think we've decided that this is definitely the right way to go.
