30 Dec 2011 nikodemus   » (Journeyer)

Holiday Hack: Bit Position

Logically speaking, POSITION with trivial :KEY and :TEST arguments should be much faster on bit-vectors than on simple vectors: the system should be able to pull one words worth of bits out of the vector at a single go, check if any are set (or unset), and if so locate the one we're interested in -- else going on to grab the next word.

Practically speaking, no-one who needed fast POSITION on bit-vectors seems to have cared enough to implement it, and so until yesterday (1.0.54.101) SBCL painstakingly pulled things one bit at a time from the vector, creating a lot of unnecessary memory traffic and branches.

How much of a difference does this make? I think the technical term is "quite a bit of a difference." See here for the benchmark results. First chart is from the new implementation, second from the new one. Other calls to POSITION are included for comparison: ones prefixed with generic- all go through the full generic POSITION, while the others know the type of the sequence at the call-site, and are able to sidestep a few things.

So, if you at some point considered using bit-vectors, but decided against them because POSITION wasn't up to snuff, now might be a good time to revisit that decision.

Gory details at the end of src/code/bit-bash.lisp, full story (including how the system dispatches to the specialized version) best read from git.

Also, if you're looking for an SBCL project for next year, consider the following:

  • Using a similar strategy for POSITION on base-strings: on a 64-bit system one memory read will net you 8 base-chars.
  • Using similar strategy for POSITION on all vectors with element-type width of half-word or less.
  • Improving the performance of the generic POSITION for other cases, using eg. specialized out-of-line versions.

Happy Hacking and New Year!

Syndicated 2011-12-30 09:35:55 from Nikodemus Siivola

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!