Library of internal sorting functions in C

This is a collection of sorting algorithms, in C originally written circa 1985. Moved to github in 2019.

The code includes implementations for:

All the examples are thoroughly tested using random input generation & assertions, there are no known bugs. I’ve been using these in production code.

The source of these algorithms are classic algorithms and C texts/books. I had to convert the original pseudocode to the ‘C’ programming language to make it actually work (Note: Sedgewick, “Algorithms in C” wasn’t yet published when I wrote this code - way back circa 1985.)

I used the First edition which used pseudocode for all examples). In addition I have introduced some minor modifications to the code over time to improve style, readability, and efficiency and added a unit-test suite to ensure correctness.

All the sorting routines are “internal” sorts, i.e. they assume your sorted data is in memory. When sorting big arrays that do not fit in memory on virtual memory systems, performance may suffer significantly.

Certain current implementations of sorting in C++ standard libraries like STL (the HP/SGI Standard Template Library which is now a part of the GNU free software known as libboost) are comparable in speed to sedgesort.

Note however that the general C/Unix qsort routine is much slower due to its generality (e.g. element compare is done via an indirect function call passed as a function pointer.)

Download the full source tree (sort-all.tgz)

References

Notes

Array-size cut-off (CUTOFF) value is an integer constant for switching from recursive to non-recursive sorting:

/*
 * Interesting historical/technological tidbit:
 *
 * 15 has been found empirically as the optimal cutoff value in 1996
 *
 * 10 years later, in 2006, with computers being very different and
 * much faster, I revisited this constant and found the optimal
 * value to be a close tie between 15 and 16
 *
 * ~7 years later (end of 2013):
 *      - On an Intel i7-4771 with hyper-threading
 *      - Larger caches
 *      - Newer compiler (clang replacing gcc)
 *      - Building with -O3 -fomit-frame-pointer
 * Technology has finally made a difference and could make the switch
 * to a higher CUTOFF value.
 *
 * Here are the run times sorting a 100k element array:
 *  -DCUTOFF=10:    sedgesort:      5148 microsec.
 *  -DCUTOFF=15:    sedgesort:      5032 microsec.
 *  -DCUTOFF=16:    sedgesort:      4998 microsec.
 *  -DCUTOFF=20:    sedgesort:      4926 microsec.
 *  -DCUTOFF=25:    sedgesort:      4880 microsec.
 *  -DCUTOFF=30:    sedgesort:      4830 microsec.
 *  -DCUTOFF=40:    sedgesort:      4770 microsec.
 *  -DCUTOFF=50:    sedgesort:      4746 microsec.  (minima)
 *  -DCUTOFF=60:    sedgesort:      4770 microsec.
 */
#ifndef CUTOFF
#  define CUTOFF 50
#endif

Brief descriptions of the sort library source files:

How To ?

Build:
$ make
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c sorttest.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c sorted.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c insort.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c shellsort.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c gamasort.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c quicksort.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c quickersort.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c sedgesort.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c heapsort.c
ar rv sort.a sorted.o insort.o shellsort.o gamasort.o quicksort.o quickersort.o sedgesort.o heapsort.o
ar: creating sort.a
a - sorted.o
a - insort.o
a - shellsort.o
a - gamasort.o
a - quicksort.o
a - quickersort.o
a - sedgesort.o
a - heapsort.o
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 sorttest.o sort.a -o sorttest -lm
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c bstest.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 -c bsearch.c
/usr/bin/clang -O3 -fomit-frame-pointer -DCUTOFF=50 bstest.o bsearch.o -o bstest
Test:
$ make test
=== checking sort routines
Benchmarking & verifying sort funcs on 100000 element array
insort:           998472 microsec.
shellsort:         10779 microsec.
heapsort:           7521 microsec.
gamasort:           5327 microsec.
quicksort:          5100 microsec.
quickersort:        5432 microsec.
sedgesort:          4486 microsec.
=== OK
=== checking binary-search routines
=== OK

Licensing:

This code is released under the BSD (3 clause) open source licence.

– Ariel Faigon