More bit-twiddling intrinsics
|32-bit w/ lzcnt||4.90||-||1.46||-|
|64-bit w/ lzcnt||6.77||-||3.71||-|
"w/ lzcnt" in the table means the numbers were measured using AMD's LZCNT (count leading zeros) instruction, which AMD introduced alongside SSE4a as part of its Advanced Bit Manipulation (ABM) extension.
The SPARC intrinsics require a processor that implements the POPC (population count) instruction in hardware.
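For readers who want to try these methods, here is a minimal sketch. It assumes the methods under discussion are the `java.lang.Integer` and `java.lang.Long` bit-counting methods (`numberOfLeadingZeros`, `numberOfTrailingZeros`, `bitCount`); the class name `BitIntrinsicsDemo` and the helper `floorLog2` are made up for illustration. Whether a given JVM actually compiles these calls down to LZCNT or POPC depends on the VM version and the CPU, so the code only shows the Java-level semantics.

```java
// Hypothetical demo class; not part of any library.
final class BitIntrinsicsDemo {

    // floor(log2(x)) for a positive int, built on the leading-zero count.
    static int floorLog2(int x) {
        if (x <= 0) {
            throw new IllegalArgumentException("x must be positive");
        }
        return 31 - Integer.numberOfLeadingZeros(x);
    }

    // Same idea for a positive long (64-bit variant).
    static int floorLog2(long x) {
        if (x <= 0) {
            throw new IllegalArgumentException("x must be positive");
        }
        return 63 - Long.numberOfLeadingZeros(x);
    }

    public static void main(String[] args) {
        // Leading zeros: the kind of call an LZCNT-capable JIT may map to one instruction.
        System.out.println(Integer.numberOfLeadingZeros(1));   // 31
        System.out.println(Long.numberOfLeadingZeros(1L));     // 63

        // Trailing zeros and population count.
        System.out.println(Integer.numberOfTrailingZeros(8));  // 3
        System.out.println(Integer.bitCount(0xFF));            // 8

        System.out.println(floorLog2(1024));                   // 10
    }
}
```

Note that `Integer.numberOfLeadingZeros(0)` is defined to return 32, which is why the helper rejects non-positive input instead of returning -1.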
I haven't yet found a real-world application that uses these methods (including bitCount) extensively; if anyone knows of one, please let me know.