Older blog entries for nconway (starting at number 13)

13 May 2005 (updated 13 May 2005 at 06:09 UTC) »

Hash indexes

I think it is somewhat common knowledge, at least among Postgres geeks, that the current implementation of on-disk linear hash indexes in PG is pretty poor. Some of the problems include:

no write-ahead logging — if a system crash occurs, the index state may be inconsistent, so you'll need to REINDEX
poor bulk-loading performance — creating a hash index on a few gigabytes of data takes a long time
doesn't support unique indexes or multi-column indexes

But more importantly, the current hash index code doesn't outperform our b+-tree implementation even for scalar equality (e.g. WHERE x = 'foo'). So there hasn't been much motivation for folks to hack on hash indexes, as few people are actually using them.

In theory, though, hash indexes could be useful: equality scans are very common, and a hash index should be faster than a b+-tree for these queries at least in non-concurrent situations (since a b+-tree needs to navigate the internal nodes of the tree for a few levels before reaching the leaf level; a hash index can jump right to the appropriate hash bucket).

This topic was raised on the pgsql-general list, and a pretty interesting discussion ensued. We came up with two simple ways to improve hash indexes:

at present, the content of a hash bucket is unordered. That means the index can use the hash to select the right bucket to scan, but needs to do a linear scan over all the bucket's pages once it gets there, running the equality function of the index's operator class for each entry in the bucket.

It would be a lot faster to just binary search within a page, but to do that we need to either define an ordering over the index keys (which we may not be able to do for a user-defined type), or sort the keys by their hash value. If we do the latter, we probably need to store the full hash value of each key in the index. Storing the full hash value is useful for other reasons, as well: for example, if an entry's hash value does not match the hash of the scankey, we needn't evaluate the opclass's equality function, since this tuple cannot match the scan.
if we're going to be storing the hash of each key in the index, it would sometimes be a good idea to only store the hash of the key, not the key's value itself. To avoid hash collisions, we'll need to recheck the key value against the heap tuple, but we need to do that anyway (PostgreSQL only stores MVCC information in the heap, so we need to check that the tuple pointed-to by the index entry actually exists).
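
To make the first idea concrete, here is a minimal sketch of searching a single page whose entries are kept sorted by their stored hash value. The names (HashEntry, page_search, key_equal) and the in-memory layout are made up for illustration; this is not the PostgreSQL page format or the code from my patch. The counter is only there to show how rarely the equality function needs to run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical index entry: the stored hash plus a pointer to the key. */
typedef struct {
    uint32_t    hash;
    const char *key;
} HashEntry;

/* Counts calls to the (potentially expensive) equality function. */
int eq_calls = 0;

int key_equal(const char *a, const char *b)
{
    eq_calls++;
    return strcmp(a, b) == 0;
}

/*
 * Search one page whose entries are sorted by hash value.
 * Returns the position of the matching entry, or -1 if there is none.
 */
int page_search(const HashEntry *page, size_t n,
                uint32_t scan_hash, const char *scan_key)
{
    size_t lo = 0, hi = n;

    assert(page != NULL || n == 0);

    /* Binary search for the first entry with hash >= scan_hash. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (page[mid].hash < scan_hash)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Only entries whose stored hash matches need the equality check. */
    for (; lo < n && page[lo].hash == scan_hash; lo++) {
        if (key_equal(page[lo].key, scan_key))
            return (int) lo;
    }
    return -1;
}
```

A probe whose hash is not on the page never calls key_equal at all, which is where a type with an expensive equality operator would benefit.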

Which was all well and good in theory, but after implementing it, it turns out that I've yet to find a case in which the performance is noticeably improved :-\ I could definitely construct a situation where the patch would be a win (e.g. a user-defined type with a complex equality operator, or an index on a very wide text field), but it is frustrating that it doesn't appear to be a win in the common case (e.g. a hash index on an int4 field with a reasonably random distribution of values).

I'm still trying to figure out if there's merit to pursuing these improvements, and if so, why I haven't seen the performance improvement I expected. Anyway, this will teach me to look at profiling data before sitting down to code...

17 Mar 2005 (updated 17 Mar 2005 at 12:26 UTC) »

grep on steroids

Recently, I've been spending some time poking around in the internals of the PostgreSQL query planner. One of the projects I've been thinking about is fixing a long-standing ugly part of the planner: the fact that it scribbles on its input.

(Context: the planner takes a Query, which is essentially an annotated, rewritten abstract syntax tree describing the query to be executed. It produces a Plan, which describes the particular way to execute the query that the planner believes to be most efficient. The problem is that the planner modifies various parts of the input Query. This is a problem for a few reasons; besides being ugly, (a) it doesn't clearly distinguish between the planner's input and its working state, and (b) it makes it impossible to pass the same Query to the planner multiple times without copying it each time. I'm a little embarrassed to admit we actually do make extra copies of a Query in a few places in the backend, for precisely this reason.)
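
A toy illustration of the copy-before-planning workaround, for anyone who hasn't run into this pattern. ToyQuery, toy_planner and plan_preserving_input are invented names, and the struct is deliberately flat; the real Query is a tree and needs a deep copy, so treat this as a sketch of the shape of the problem rather than backend code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a parsed query; not the real PostgreSQL Query struct. */
typedef struct {
    int    quals[8];
    size_t n_quals;
} ToyQuery;

/*
 * A planner that scribbles on its input: it "simplifies" the qual list
 * in place by dropping zero entries, then returns how many quals remain.
 */
size_t toy_planner(ToyQuery *query)
{
    size_t in, out = 0;

    assert(query != NULL && query->n_quals <= 8);
    for (in = 0; in < query->n_quals; in++) {
        if (query->quals[in] != 0)
            query->quals[out++] = query->quals[in];
    }
    query->n_quals = out;
    return out;
}

/*
 * The workaround: plan a private copy so the caller's query survives.
 * A flat struct can be copied by assignment; a real parse tree needs a
 * deep copy, which is where the extra cost comes from.
 */
size_t plan_preserving_input(const ToyQuery *query)
{
    ToyQuery scratch;

    assert(query != NULL);
    scratch = *query;
    return toy_planner(&scratch);
}
```

Calling toy_planner directly leaves the caller holding a modified query; going through plan_preserving_input keeps the original intact at the price of a copy on every call.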

So, the first thing I wanted to determine was: in what situations does the planner modify one of the fields of the input Query?

Um, so it seems this isn't a trivial problem to solve. The planner is about 30,000 LOC, and the input Query is used all over the place. I also want to find indirect assignments to the input Query: for example, if the planner passes a pointer to a field of the input Query to a function, and the function then proceeds to write through the pointer. (In this particular example, I have a pretty good idea where the most egregious offenders are, so I can probably solve the problem well enough via simple text search using grep, glimpse, and the like. Still, a general-case solution would be interesting, I think.)

It might be plausible to do this via gdb, valgrind or the like (waiting until a write is actually made to the input Query at runtime and noting the call site then). But this only catches the modifications that happen to be made by a particular invocation of the planner, not all the modifications that might occur. It is also a hack: this information is statically derivable from the source code, so a static analysis seems much nicer than solving the problem at runtime.

Text search that does not incorporate knowledge of the syntax of the source code simply doesn't cut it. One example among many is:

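/* The write happens here, with no Query in sight. */
void some_function(SomeType *st) {
    st->field = xxx;
}

/* The planner only takes the address of a field, so a text search
   for assignments to "query->..." finds nothing. */
Plan *planner(Query *query) {
    some_function(&(query->some_field));
}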

Solving the problem in the general case also requires considering aliasing; alias analysis (in C anyway) is a rather complex problem to solve effectively, even in a compiler.
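
Here is a small sketch of why aliasing makes this hard. The struct and function names are invented for illustration. Which field gets overwritten depends on runtime data, and the line that performs the write never mentions the query at all.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the planner's input; not the real Query struct. */
typedef struct {
    int limit;
    int offset;
} ToyQuery;

int toy_planner(ToyQuery *query)
{
    int *alias;

    assert(query != NULL);

    /* The alias is created here; its target depends on the data. */
    alias = &query->limit;
    if (query->offset > 0)
        alias = &query->offset;

    /* The write happens here, through a name unrelated to "query". */
    *alias = 0;

    return query->limit + query->offset;
}
```

A sound analysis has to assume that alias may point at either field, which is exactly the kind of may-alias bookkeeping that text search cannot do.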

Besides this, there are interesting queries about source code that can't easily be expressed via searching for patterns in text — "show me the functions where function X is called before function Y has been called", "show me the functions that are only invoked by other functions defined in the same file (i.e. these would be candidates for being marked static)", and so on. ISTM a syntax-aware source code search tool like this would be an interesting thing to write (of course, if anyone knows of an F/OSS tool that already does this, let me know).
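
To give a flavour of the second query, here is a rough sketch that assumes some front end has already extracted function definitions and call edges into flat tables. FuncDef, CallEdge and is_static_candidate are hypothetical names; extracting those tables from real C source is the hard part and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical facts produced by a syntax-aware front end. */
typedef struct {
    const char *name;   /* function name */
    const char *file;   /* file that defines it */
} FuncDef;

typedef struct {
    const char *caller;
    const char *callee;
} CallEdge;

/* Returns the defining file of a function, or NULL if it is unknown. */
static const char *file_of(const FuncDef *defs, size_t ndefs, const char *name)
{
    size_t i;

    for (i = 0; i < ndefs; i++) {
        if (strcmp(defs[i].name, name) == 0)
            return defs[i].file;
    }
    return NULL;
}

/*
 * Returns 1 if every known call to `name` comes from the file that
 * defines it. Functions with no callers also qualify here; a real tool
 * would have to treat entry points and address-taken functions apart.
 */
int is_static_candidate(const FuncDef *defs, size_t ndefs,
                        const CallEdge *edges, size_t nedges,
                        const char *name)
{
    const char *home = file_of(defs, ndefs, name);
    size_t i;

    assert(home != NULL);   /* `name` must be a defined function */

    for (i = 0; i < nedges; i++) {
        const char *caller_file;

        if (strcmp(edges[i].callee, name) != 0)
            continue;
        caller_file = file_of(defs, ndefs, edges[i].caller);
        if (caller_file == NULL || strcmp(caller_file, home) != 0)
            return 0;
    }
    return 1;
}
```

Once the facts are in tables like these, the query itself is a few lines; the interesting work is in building the tables correctly.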

23 Feb 2005 »

This thread on llvm-dev is worth reading if you're interested in compiler internals.

18 Feb 2005 (updated 19 Feb 2005 at 11:42 UTC) »

I've been thinking about static analysis for finding bugs off and on for the past 18 months or so; recently, I've been looking for a good open source static analysis tool. Unless I've managed to miss it, ISTM there isn't one.

Uno is the closest I've found, but it is pretty unpolished, and I don't believe it is DFSG-free.

sparse, Linus' checker, may or may not be cool; I've tried to see what it's capable of, but wasn't able to make it catch anything more significant than minor stylistic errors in C code (e.g. extern function definitions, 0 used in a pointer context (rather than NULL), that sort of thing). (Side note: sparse doesn't even have a website, and it's primarily available via bk. Does Linus not want people to use his software?) I'll definitely take a closer look at this, anyway.

There are some more academic tools (like Uno, only even less practical). There's also Splint, but last I tried it, it emitted way too many bogus error reports, and required tons of annotations to be of any use.

Some random thoughts about the design of an open source static analysis tool:

A tool that hides a handful of legitimate error reports within thousands of bogus ones is essentially useless. Given the choice, it is better to miss a few problems than to warn the user about everything that might be bogus — false positives are bad.
- A reasonable substitute would be some effective means of sorting error reports by their likelihood of legitimacy; if the tool generates thousands of bogus errors but places the legitimate errors at the top of the list, I'd be willing to live with it.
It ought to be easy to check an arbitrary base of code. That means understanding C99 plus all the GNU extensions, and providing an easy way to run the checker on some source (while getting header #includes right, running the necessary tools to generate derived files, and so on). Perhaps the easiest way to do that is have the checker mimic the standard $CC command-line arguments; then the user could run the checker via make CC=checker.
- This also means no annotations. They are ugly, they tie the source into one specific analysis tool, and they are labour intensive; the whole point is to find bugs with the minimum of labour by the programmer.
It ought to be possible to write user-defined extensions to the checker, to check domain-specific properties of the source. I've got no problem with annotations in this context — that's a sensible way to inform your checker extension about domain-specific properties.
The theory behind Dawson Engler's work on MC is a good place to start; it is more or less the state of the art, AFAIK. Unfortunately the tool they developed was never released publicly (from what I've heard it was somewhat of a kludge anyway, implementation-wise), and Engler's now commercialized the research at Coverity.
ckit might be worth using. Countless people have implemented C compiler frontends in the past, so it would be nice to avoid needing to reinvent that particular wheel.
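
As a sketch of the make CC=checker idea, here is roughly how a checker could sort through a compiler-style command line: keep the preprocessor options and the source files, and ignore the rest. CheckerArgs and checker_parse_args are hypothetical names, and the set of flags handled is a small assumed subset of what a real $CC accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SOURCES   16
#define MAX_CPP_FLAGS 32

/* What a hypothetical checker keeps from a compiler command line. */
typedef struct {
    const char *sources[MAX_SOURCES];
    size_t      n_sources;
    const char *cpp_flags[MAX_CPP_FLAGS];   /* -I, -D, -U and their values */
    size_t      n_cpp_flags;
} CheckerArgs;

static int has_suffix(const char *s, const char *suffix)
{
    size_t ls = strlen(s), lx = strlen(suffix);

    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

static void push_cpp_flag(CheckerArgs *out, const char *flag)
{
    if (out->n_cpp_flags < MAX_CPP_FLAGS)
        out->cpp_flags[out->n_cpp_flags++] = flag;
}

/* Classify argv the way a compiler driver would; argv[0] is skipped. */
void checker_parse_args(int argc, char **argv, CheckerArgs *out)
{
    int i;

    assert(out != NULL);
    out->n_sources = 0;
    out->n_cpp_flags = 0;

    for (i = 1; i < argc; i++) {
        const char *a = argv[i];

        if (strncmp(a, "-I", 2) == 0 || strncmp(a, "-D", 2) == 0 ||
            strncmp(a, "-U", 2) == 0) {
            push_cpp_flag(out, a);
            /* Detached form ("-I dir"): the value is the next argument. */
            if (a[2] == '\0' && i + 1 < argc)
                push_cpp_flag(out, argv[++i]);
        } else if (strcmp(a, "-o") == 0) {
            i++;                /* skip the output file name */
        } else if (a[0] != '-' && has_suffix(a, ".c")) {
            if (out->n_sources < MAX_SOURCES)
                out->sources[out->n_sources++] = a;
        }
        /* Everything else (-O2, -Wall, -c, ...) is silently ignored. */
    }
}
```

Ignoring unknown flags instead of rejecting them is deliberate: the point is for an existing Makefile to work unmodified.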

Speaking of tools for finding bugs, I've got to find some time to make valgrind understand region-based memory allocation.

22 Jan 2005 »

Vacation

I took about a month off work. I was in Perth for about two weeks to celebrate Christmas with my aunt's family, and then in Cairns for about 10 days, doing some scuba diving with a friend who was over from Canada.

PostgreSQL

Started back at work last Monday. 8.0.0 got released, which is great -- this release has a ton of new functionality that I'm really happy about.

The tree is now open for 8.1 work, so I got a chance to check in some stuff that's been sitting on my hard drive for a while. Sped up rtree scan performance by about 10%; I have similar patches for GiST which I'll commit soon. The GiST stuff also overhauls memory management: GiST user-provided functions will now always be invoked in a short-lived memory context, so people implementing GiST-based indexes won't need to worry about freeing palloc'ed memory. One of the lessons of working on the PG source: region-based memory allocation is a Good Thing.
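
For anyone unfamiliar with the idea, here is a minimal sketch of region-based allocation: allocate freely from a region, then release everything with a single reset. Region, region_alloc and friends are made-up names for illustration; this is not PostgreSQL's memory context code, which is considerably more sophisticated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define REGION_ALIGN 16

/* A fixed-size bump allocator; individual allocations are never freed. */
typedef struct {
    unsigned char *base;
    size_t         used;
    size_t         size;
} Region;

/* Returns 1 on success, 0 if the backing memory could not be obtained. */
int region_init(Region *r, size_t size)
{
    assert(r != NULL);
    r->base = malloc(size);
    r->used = 0;
    r->size = size;
    return r->base != NULL;
}

/* Hands out n bytes from the region, or NULL if it is exhausted. */
void *region_alloc(Region *r, size_t n)
{
    size_t pad;
    void  *p;

    assert(r != NULL && r->base != NULL);
    if (n > r->size - r->used)
        return NULL;

    p = r->base + r->used;
    r->used += n;

    /* Pad to the next multiple of REGION_ALIGN, clamped to the region size. */
    pad = (REGION_ALIGN - r->used % REGION_ALIGN) % REGION_ALIGN;
    r->used = (pad > r->size - r->used) ? r->size : r->used + pad;
    return p;
}

/* Releases every allocation at once; the memory is reused afterwards. */
void region_reset(Region *r)
{
    assert(r != NULL);
    r->used = 0;
}

void region_destroy(Region *r)
{
    assert(r != NULL);
    free(r->base);
    r->base = NULL;
    r->used = r->size = 0;
}
```

The caller of a user-provided function can reset the region after each call, so the function itself never needs to free anything it allocated.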

While cleaning up various things in PL/PgSQL (mostly memory management related), I noticed a buffer overrun in the parsing of refcursors. Patched that for 7.4 and 8.0.

I took a look at adding support for GCC's profile-guided optimization to the build system. I'm a little confused -- why don't more projects take advantage of this? Particularly when, say, building RPM packages, it would make sense to trade some extra compile-time for a few % improvement in runtime performance. On the other hand, I ran into some problems actually using the PGO support (e.g. this), so perhaps that's one reason PGO support hasn't (AFAICS) taken off.

robocoder: Thanks for mentioning the pending patent on ARC. Unfortunately, that came as quite a surprise. I passed on the bad news to the pgsql-hackers list, which started a spirited discussion of the topic. I'm not sure what the resolution to the problem is going to be; personally I think we ought to replace ARC with a simple LRU scheme in 8.0.1, and worry about a better, unencumbered replacement for 8.1. But in any case I'm glad we found out about the problem sooner rather than later.

Books

Just started reading Paul Graham's Hackers and Painters, and I'm really enjoying it so far. I also have Conrad Black's biography of FDR to start (1300 pages, yum).

20 Nov 2004 »

An interesting statistic from the Economist:

One of the best statistics of the campaign is that people worth $1m-10m supported Mr Bush by a 63-37% margin, whereas those worth more than $10m favoured Mr Kerry 59-41%.

4 Nov 2004 »

Robert Kagan's article Power and Weakness in Policy Review was written in 2002, but it's still a fascinating read. Choice quote:

Today's transatlantic problem, in short, is not a George Bush problem. It is a power problem. American military strength has produced a propensity to use that strength. Europe's military weakness has produced a perfectly understandable aversion to the exercise of military power. Indeed, it has produced a powerful European interest in inhabiting a world where strength doesn't matter, where international law and international institutions predominate, where unilateral action by powerful nations is forbidden, where all nations regardless of their strength have equal rights and are equally protected by commonly agreed-upon international rules of behavior. Europeans have a deep interest in devaluing and eventually eradicating the brutal laws of an anarchic, Hobbesian world where power is the ultimate determinant of national security and success.

28 Oct 2004 »

Random Thought

Nothing can be more fallacious than to found our political calculations on arithmetical principles. Sixty or seventy men may be more properly trusted with a given degree of power than six or seven. But it does not follow that six or seven hundred would be proportionably a better depositary. And if we carry on the supposition to six or seven thousand, the whole reasoning ought to be reversed. The truth is, that in all cases a certain number at least seems to be necessary to secure the benefits of free consultation and discussion, and to guard against too easy a combination for improper purposes; as, on the other hand, the number ought at most to be kept within a certain limit, in order to avoid the confusion and intemperance of a multitude. In all very numerous assemblies, of whatever character composed, passion never fails to wrest the sceptre from reason. Had every Athenian citizen been a Socrates, every Athenian assembly would still have been a mob.

-- James Madison, The Federalist #55

13 Oct 2004 (updated 13 Oct 2004 at 08:24 UTC) »

There was an interesting thread on the GCC development list about what kind of optimizations can legally be performed on "explicit storage" (e.g. malloc in C, operator new in C++). Various folks raised concerns about how this changes programmer expectations and whether it is allowed by the C or C++ standards. Interestingly, Chris Lattner pointed out that LLVM actually implements this optimization, at least for malloc (as usual, C++ makes things more complicated, but even then LLVM could theoretically perform the optimization at link-time).

8 Oct 2004 »

Since my last blog entry, I:

Finished my summer internship
Spent a few weeks in Toronto
Decided that I wanted to take twelve months off university
Took a twelve-month contract to work full-time on PostgreSQL for Fujitsu Australia Software Technologies
Moved to Sydney, Australia

So, a lot is new :)
