In C++, is empty() faster than comparing the size with zero?
Most C++ programmers rely on “STL” for their data structures. The most popular data structure is probably vector, which is just a dynamic array. The set and the map are other useful ones.
The STL data structures follow a minimalist design: you have relatively few methods. All of them allow you to compute the size of the data structure, that is, how many elements it contains, via the size() method. In recent C++ (since C++11), the size() method must have constant-time complexity for all containers. In other words, the people implementing the containers are never allowed to scan the content to find out the number of elements.
These containers also have another method called empty() which simply returns true if the container is… well… empty. Obviously, an equivalent strategy would be to compare the size with zero: mystruct.size() == 0.
Determining whether a data structure is empty is conceptually easier than determining its size. Thus, at least in theory, calling empty() could be faster.
Inspecting the assembly output, I find that recent versions of GCC produce nearly identical code for the comparison of the size and the empty call. The exception is the list data structure, where the assembly is slightly different, but not in a manner that should affect performance.
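You can check this yourself by feeding a minimal pair like the following to your compiler (the function names here are just for illustration) and comparing the generated assembly:

```cpp
#include <vector>

// Two ways to ask the same question: recent GCC emits
// essentially the same instructions for both.
bool via_empty(const std::vector<int>& v) {
  return v.empty();
}

bool via_size(const std::vector<int>& v) {
  return v.size() == 0;
}
```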
Of course, there are different implementations of C++ and it is possible that other implementations could provide more efficient code when calling empty(). An interesting question is whether effort is needed from the compiler.
Travis Downs wrote a list data structure by hand, but with a size() function that is linear time. He then implemented the emptiness check naively on top of it:
struct node {
  struct node *next;
  int payload;
};

// Walk the entire list to count its elements: linear time.
int count_nodes(const node* p) {
  int size = 0;
  while (p) {
    p = p->next;
    size++;
  }
  return size;
}

// Naive emptiness check built on the linear-time count.
bool is_not_empty(const node* p) {
  return count_nodes(p) > 0;
}
Amazingly, we find that the GCC compiler is able to compile Travis’ is_not_empty C++ function to constant-time code. The compiler inlines the count_nodes function into is_not_empty. Then the compiler figures out that as soon as you enter the loop once in count_nodes, size is going to be greater than zero, so there is no need to keep looping.
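In effect, the optimizer arrives at something equivalent to a simple null check (a sketch of the resulting logic, not the literal assembly):

```cpp
// Same node type as above.
struct node {
  struct node* next;
  int payload;
};

// What the optimized is_not_empty amounts to: no loop, just a pointer test.
bool is_not_empty_optimized(const node* p) {
  return p != nullptr;
}
```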
However, the optimisation is not robust. Suppose that I wish to return an unsigned type instead of Travis’ int value. The problem with an unsigned type is that it might overflow if the list is very long. With a signed integer, the compiler is allowed to assume that overflows do not happen. If the compiler must handle overflows, it could be difficult for it to tell whether count_nodes() returns zero or not. To handle this potential issue, I can forcefully bound the return value of count_nodes() to be no more than 1000. If I change the code to return a standard size_t type, like so…
#include <cstddef>

struct node {
  struct node *next;
  int payload;
};

// Count the elements, but cap the result at 1000 so that the
// unsigned counter can never overflow.
size_t count_nodes(const node* p) {
  size_t size = 0;
  while (p) {
    p = p->next;
    size++;
    if (size == 1000) {
      return 1000;
    }
  }
  return size;
}

bool is_not_empty(const node* p) {
  return count_nodes(p) > 0;
}
Sadly, GCC is now unable to optimize away the call. Maybe compilers are not yet all-powerful beings?
The lesson is that it is probably wise to get in the habit of calling empty() directly if you care about performance. Though it may not help much with modern STL data structures, in other code it could be different.
Of course, another argument is that the call to empty() is shorter and cleaner.
Credit: This blog post was motivated by a tweet by Richard Startin.