Yeah, apparently my first idea was in fact the right one. I'm talking about my shared-row-locks project. Due to some misunderstanding on my part, I figured that the simpler idea was wrong, so I tried a lot of other things, and of course they were wrong too. Eventually I came back and tried the first thing again, and discovered that it works as expected (by me at least). So I posted the patch, and promptly received a comment from Tom which made me notice a gross mistake. Easily solved, but gross anyway ;-)
performance measure stupidity
I ran some performance tests to verify that my patch won't make people too angry at me. I was terrified to discover that pgbench performance had dropped by 25% or so. I spent an hour and a half looking at the patch searching for the culprit (I didn't want to compile with profiling enabled because my machine is somewhat slow) ... And then I realized that I had compiled the whole backend/access/heap directory with -O0. I recompiled, reran pgbench, and now I see no measurable difference between pristine sources and my tree. That's fortunate at least. I still have to see how the lock-spilling code will perform.