Efficient Doesn't Equal Performant
This is a bit of a rant, I guess. Sorry about that.
A few years back I needed a binary heap, and I needed one that was fast and thread safe. So I wrote Pileup.
There are other heaps for Common Lisp, and some of them support operations Pileup doesn't implement out of the box, and all of them claim to be efficient.
...and I'm sure that algorithmically they are. However, constant factors matter more often than you might think. I'll single out CL-HEAP primarily because it has such an authoritative name. :)
A tiny benchmark:
(defvar *1M-numbers* (loop repeat 100000 collect (random most-positive-fixnum))) (defvar *4K-numbers* (loop repeat 4000 collect (random most-positive-fixnum))) (defun insert-and-pop (insert pop heap things) (declare (function insert pop)) (dolist (thing things) (funcall insert heap thing)) (loop for thing = (funcall pop heap) while thing)) (defun make-insert-and-pop (make insert pop things) (declare (function make)) (insert-and-pop insert pop (funcall make) things)) (defun time-insert-and-pop (insert pop heap things) ;; Time 4 runs. (time (loop repeat 4 do (insert-and-pop insert pop heap things))) t) (defun time-make-insert-and-pop (make insert pop things) ;; Time 1000 runs. (time (loop repeat 1000 do (make-insert-and-pop make insert pop things))) t) (defun cl-heap-make () (make-instance 'cl-heap:priority-queue)) (defun cl-heap-insert (heap thing) (cl-heap:enqueue heap thing thing)) (defun cl-heap-pop (heap) (cl-heap:dequeue heap)) (defun pileup-make () (pileup:make-heap #'
Results: (median of three runs for each);;; CL-HEAP: insert and pop 1M numbers x 4 into a single heap Evaluation took: 6.077 seconds of real time 6.054307 seconds of total run time (5.871448 user, 0.182859 system) [ Run times consist of 0.300 seconds GC time, and 5.755 seconds non-GC time. ] 99.62% CPU 15,758,370,862 processor cycles 208,389,696 bytes consed ;;; PILEUP: insert and pop 1M numbers x 4 into a single heap Evaluation took: 0.409 seconds of real time 0.410249 seconds of total run time (0.409089 user, 0.001160 system) 100.24% CPU 1,060,531,810 processor cycles 3,053,296 bytes consed ;;; CL-HEAP: make 1K heaps, insert and pop 4K numbers Evaluation took: 38.221 seconds of real time 38.051652 seconds of total run time (37.799509 user, 0.252143 system) [ Run times consist of 0.433 seconds GC time, and 37.619 seconds non-GC time. ] 99.56% CPU 99,125,251,225 processor cycles 1,940,254,144 bytes consed ;;; PILEUP: make 1K heaps, insert and pop 4K numbers Evaluation took: 3.468 seconds of real time 3.476932 seconds of total run time (3.453248 user, 0.023684 system) [ Run times consist of 0.021 seconds GC time, and 3.456 seconds non-GC time. ] 100.26% CPU 8,991,845,681 processor cycles 98,563,520 bytes consed
(I was also going to compare parallel performance, but CL-HEAP doesn't appear to be thread-safe, so...)
This is not to disparage CL-HEAP: it supports things which Pileup doesn't, but it clearly isn't written with constant factors in mind, and this shows.
Constant factors matter.
(Admittedly, I tested this only on SBCL, and it might turn out that CL-HEAP does a lot better -- and Pileup a lot worse -- on some other implementation. This does not alter my main contention that you ignore constant factors at your own peril.)