, 1 min read
Sorting is fast and useful
I like to sort things. If you should learn one thing about Computer Science is that sorting is fast and useful.
Here’s a little example. You want to check quickly whether an integer belongs to a set. Maybe you want to determine whether a userID is valid. The solutions:
- Use a hash table. Java programmers use the HashSet class.
- Use a tree structure such as a red-black tree or a B-tree. Java programmers use the TreeSet class.
- If your set of integers changes rarely, you can sort it, and then try to locate integers using binary search.
I wrote a Java benchmark to compare the three solutions:
Binary search over a sorted array is a only 10% slower than the HashSet. Yet, the sorted array uses half the memory. Hence, using a sorted array is the clear winner for this problem.
If you think that’s a little bit silly, consider that column-oriented DBMSes like Vertica use binary search over sorted columns as an indexing technique.
Code: Source code posted on my blog is available from a github repository.