It’s a little surprising to look at textbook algorithms and data structures and see which ones have retained popularity in the real world. A number of sorting algorithms seem also to have fallen by the wayside, to be mentioned in programming interviews and then never again.
With hash maps, I’m not sure whether it’s a big deal or just heavy use of the small case, say an associative array with 5-10 members.
I think it is obvious why search trees are not seen often. A search tree needs a total ordering on all keys, which implies either that all keys are of the same type or that you extend your ordering to an ordering over types, which is at best a partial order in most (all?) type systems. So for a mapping from arbitrary keys to values, there is no obvious way to implement it as a search tree.
Whereas a hash map “only” needs a hashing function (one that is well distributed, with few collisions, and so on). So besides having that function, no restrictions are placed on the set of all keys.
In a search tree, the set of all keys needs to have a total ordering.
That’s a good point, but how often do you have maps where keys have varied types? In JavaScript, all keys of an Object must be strings, and there is a natural (lexicographic) ordering for strings. JavaScript’s Set type does support varied types, however. In Go, you have to specify the key type, and the type has to be comparable but need not be ordered. Python natively supports hash maps with keys of varied types.
Not only varied types but compound types too. If you have a struct/tuple of an int and a string as a key, you have to implement your own compare function that compares first by the int and then by the string (or the other way around), where neither order is necessarily “natural”. You still need a total ordering.
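To make that concrete, here is a minimal C++ sketch (the struct and field names are invented for illustration) of the total ordering such a compound key needs before it can go into a tree-based container like std::map:

```cpp
#include <string>
#include <tuple>

// Hypothetical compound key: an int and a string.
struct Key {
    int id;
    std::string name;
};

// Total ordering required by tree-based containers such as std::map:
// compare by the int first, then by the string. The opposite order
// would be an equally "natural" choice -- we simply have to pick one.
bool operator<(const Key& a, const Key& b) {
    return std::tie(a.id, a.name) < std::tie(b.id, b.name);
}
```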
With hash maps, you just use the hash functions of your elements and combine them with shifts and xors. And if there is no natural way to sort your items, then there is no need to traverse them in some order, so you don’t have to write any comparison functions at all.
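A minimal sketch of that hash-based alternative for the same hypothetical Key, assuming the standard std::hash specializations for the fields:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct Key {
    int id;
    std::string name;
    bool operator==(const Key& o) const { return id == o.id && name == o.name; }
};

// Combine the field hashes with a shift and an xor, as described above.
// (Production code often uses a stronger mixer, e.g. boost::hash_combine.)
struct KeyHash {
    std::size_t operator()(const Key& k) const {
        std::size_t h1 = std::hash<int>{}(k.id);
        std::size_t h2 = std::hash<std::string>{}(k.name);
        return h1 ^ (h2 << 1);
    }
};

// No ordering needed: hashing and equality are enough.
std::unordered_map<Key, int, KeyHash> table;
```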
In databases, where B-trees are used as indexes, you need to define an order even for compound keys, and most other indexable database types are naturally orderable.
Of course, many things can go wrong in writing a hash function, but given that most environments (thinking of Java and Python here) provide good hash functions for simple types and tools to combine them, this may be preferable to every key type having to implement a Comparable interface.
Yes. I like your answer very much. Clearly the answer has to do with the relevance of having an order to your keys.
But consider this. If you want to build a search tree in JavaScript, Go or Python, you are basically on your own. There is no easy solution.
Yet, there are certainly lots of cases where there is a natural ordering. Strings and integers are ordered. Their order is probably relevant in some way to the application.
Still, there is apparently very little need to go through these keys in sorted order.
Trees are the basis of every persistent data structure used in functional programming.
Of course, in C++ you can use the Boost property_tree, for which each node provides an ordered list of its subnodes.
A couple of thoughts:
As already mentioned by another commenter, trees are central to persistent data structures. While persistent data structures are not ubiquitous in mainstream software, my sense is that they are becoming more popular because they make it somewhat easier to write parallel and multitasking code.
And relatively recently (in data structure terms; about 15 years ago) it was shown that hash-based structures can be made tree-shaped: hash array mapped tries. For any data structure nerds out there who don’t know it already, learning about the HAMT is a treat.
In my experience it’s reasonably common to build search trees layered on top of some other data structure, rather than as a core container library. I agree with the sentiment that for sets and maps, hash-based is often more convenient than comparison-based. However, I don’t think search trees as a concept are going the way of the dodo any time soon.
Simon says:
On modern CPUs, navigating a large tree means touching many potentially uncached lines of memory. This can make large tree structures orders of magnitude slower than hash maps.
For example, I recently replaced a tree structure holding all IPv4 ranges with a hash map approach. The tree usually needed 32 branches traversed to get an answer, whereas the hash map only touches 3 cache lines and is 20 times faster. The real reason to switch, however, is that the hash map is 30 times smaller.
1. How could a hash map be 30 times smaller? Surely, there is more to your story…
2. A red-black tree can scale poorly, because of the log n data accesses, but a B-tree ought to fix that…
Simon says:
There is more to the story. The first 24 bits of each range are used as an array index, i.e. a perfect hash. The last 8 bits of each range are stored in a sequential blob which has to be brute-force searched. However, an interesting side effect of uncached-line fetch speeds is that it’s often faster to brute-force search within cached lines (no memory-expensive metadata like pointers needed!) than to fetch a single uncached line. This brute-force approach is unintuitive for most developers. You have to think about the CPU as having a split personality: operations on cached memory are like walking across the street to the store, while operations on uncached memory are like driving to the store and having to deal with parking… You can do a whole lot of walking if you don’t have to drive 🙂
This leads us to the memory prefetch instruction which most data structures ignore. But that’s another story…
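To help picture it, here is a minimal C++ sketch of how such a two-level layout could look. All names are hypothetical, and the details are a guess at the scheme described above, not Simon’s actual code:

```cpp
#include <cstdint>
#include <vector>

// The top 24 bits of the IPv4 address index a flat offsets array (a perfect
// hash); the low bytes for each slot sit contiguously in one blob that is
// scanned linearly -- no per-node pointers, so a lookup touches few cache lines.
struct Ipv4Table {
    std::vector<uint32_t> offsets = std::vector<uint32_t>((1u << 24) + 1, 0);
    std::vector<uint8_t>  low_bytes;  // sequential blob of low-byte boundaries
    std::vector<int>      values;     // one value per entry in low_bytes

    int lookup(uint32_t ip) const {
        uint32_t slot = ip >> 8;     // first 24 bits: direct array index
        uint8_t  low  = ip & 0xFFu;  // last 8 bits
        // Brute-force scan of a tiny contiguous run. Exact match shown for
        // simplicity; a range lookup would scan for the covering boundary.
        for (uint32_t i = offsets[slot]; i < offsets[slot + 1]; ++i) {
            if (low_bytes[i] == low) return values[i];
        }
        return -1;  // not found
    }
};
```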
KWillets says:
Cache complexity has indeed surpassed instruction count for in-memory sorting and searching performance. Essentially we’ve moved to an external memory model for anything larger than a few megabytes.
Also, for long keys, such as URLs or file paths with long distinguishing prefixes, structures such as tries do poorly. I suspect there has been an inflation in key length as memory has become cheaper.
Reinhard says:
I think this is merely a symptom of a larger issue:
In academia, algorithms are evaluated using machine models from the 80s (or earlier). However, actual computers work the way @Simon described, which is fundamentally different from the old models.
Developers care about actual performance. Academia cares about theoretical performance and rarely compares algorithms using more recent models or actual software. When it does, the software is often unoptimized (and write-only) and therefore does not suffice for a proper comparison.
There are some exceptions (I think you, Daniel, are a particularly good one) but they are quite rare.
Simon says:
I agree with you, Reinhard. However, I think that most developers are not aware of the cost of fetching lines of memory, which might be okay if their Java or C++ built-in library functions were aware of it. But sadly, most of them are not.
Yours being a high-profile developer blog, I would love to see @Daniel blogging about real-world experiments illustrating how slow fetching memory is, and how the memory prefetch instruction might be used to make some algorithms ~10 times faster.
>>> I would love to see @Daniel blogging about real-world experiments illustrating how slow fetching memory is, and how the memory prefetch instruction might be used to make some algorithms ~10 times faster. <<<
This is not a bad idea.
This paper is a start; it goes quite a bit into prefetch and locality effects on real-world hardware: http://arxiv.org/pdf/1509.05053.pdf
Thanks for posting the link.
Page 16 talks about explicitly prefetching using the prefetch instruction. I have found it very difficult to use this instruction in a regular algorithm and see any big benefit. Why? It’s very difficult to predict how far ahead of time you need to execute it.
However, I have had great results using the instruction in what I’ll call “batch mode”. Let’s say you have a hash table and a batch of 20 keys to look up. You hash each key and then, in a loop, prefetch the random location in memory that you want to access. By the time the last prefetch is done, the code can go back and actually access the memory for the first key without any delay, because it has just been cached. This gives a huge performance boost to the hash table because so much is happening in parallel rather than sequentially. So ~10x performance, but only when doing batched operations.
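A rough C++ sketch of this batch idea, assuming a simple flat open-addressing table. The names and layout are hypothetical; __builtin_prefetch is the GCC/Clang builtin (MSVC offers _mm_prefetch instead):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Entry { uint64_t key; uint64_t value; };

// Look up a batch of keys against a flat, non-empty table. Single-probe
// lookups are shown for brevity; a real table would handle collisions.
size_t lookup_batch(const std::vector<Entry>& table,
                    const std::vector<uint64_t>& keys,
                    std::vector<uint64_t>& out) {
    std::vector<size_t> slots(keys.size());
    // Pass 1: hash every key and prefetch its slot. The cache-line fetches
    // overlap, so the memory latencies are paid in parallel, not one by one.
    for (size_t i = 0; i < keys.size(); ++i) {
        slots[i] = keys[i] % table.size();  // stand-in for a real hash function
        __builtin_prefetch(&table[slots[i]]);
    }
    // Pass 2: by the time we return to the first key, its cache line has
    // (hopefully) arrived, so the actual probes hit cached memory.
    size_t found = 0;
    for (size_t i = 0; i < keys.size(); ++i) {
        const Entry& e = table[slots[i]];
        if (e.key == keys[i]) { out.push_back(e.value); ++found; }
    }
    return found;
}
```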
It is a very interesting paper. They do seem to make good use of prefetch in this instance. Note however that, regarding binary search, they could have gotten pretty much the same performance characteristics by simply switching from the branchless to the branchy algorithm depending on data size… and that would have the benefit of using only straight C code.
So we are not talking about making algorithms 10 times faster through prefetching…
KWillets says:
I would add that cache prediction in general is hard.
Simon says:
>>> So we are not talking about making algorithms 10 times faster through prefetching… <<<
Not in that particular paper. But had they implemented a batch lookup version which does e.g. 20 lookups concurrently then I believe they could have achieved such a significant speed-up. Unfortunately, they didn't make optimal use of the prefetch instruction IMO.
Map is a JavaScript type that specifically represents key-value pairs. The stored entries are not inherited from a prototype, and the keys can be of any type, not necessarily strings.
The MDN documentation on Map lists the differences from Object as follows:
An Object has a prototype, so there are default keys in the map. However, this can be bypassed using map = Object.create(null).
The keys of an Object are strings, whereas they can be any value for a Map.
You can get the size of a Map easily while you have to manually keep track of size for an Object.
Check this out: https://hackr.io/blog/javascript-map