Daniel Lemire's blog

, 1 min read

Sorting is fast and useful

I like to sort things. If you should learn one thing about Computer Science is that sorting is fast and useful.

Here’s a little example. You want to check quickly whether an integer belongs to a set. Maybe you want to determine whether a userID is valid. The solutions:

I wrote a Java benchmark to compare the three solutions:

Binary search over a sorted array is a only 10% slower than the HashSet. Yet, the sorted array uses half the memory. Hence, using a sorted array is the clear winner for this problem.

If you think that’s a little bit silly, consider that column-oriented DBMSes like Vertica use binary search over sorted columns as an indexing technique.

Code: Source code posted on my blog is available from a github repository.