27th January 2017, 11 min read

How expensive are the union and intersection of two unordered_set in C++?

16 thoughts on “How expensive are the union and intersection of two unordered_set in C++?”

bsf says:

January 28, 2017 at 3:30 am

It would have been cool to compare std::map and std::set, as these are ordered by design but have the pointer chasing issues.
1. bsf says:
  
  January 28, 2017 at 3:32 am
  
  Or are you not including the time spent to insert() in your measurements? That’s a little confusing.
  
  If I have to add the elements to a set in order to sort them (by your proposed algorithm) then we should be including that time.
  
  Anyways, if you are including insert time then it doesn’t compare a plain prepopulated set, and if you aren’t including insert time then the numbers are not totally correct.
  
  I think
  1. Daniel Lemire says:
    
    January 28, 2017 at 4:02 am
    
    Or are you not including the time spent to insert() in your measurements? That’s a little confusing.
    
    I think I describe the benchmark accurately, and I provide the full source code that you can review. There should be no confusion.
    
    The test is as follow: take two objects of some class (std::vector, std::set, std::unordered_set) and produce a new one of the same class that represents either the union or the intersection.
    
    So, we do not include the time necessary to create the original two sets, but we measure the time required to create either the intersection or the union.
    
    Obviously, you can question whether it is “fair” but that becomes subjective.
2. Daniel Lemire says:
  
  January 28, 2017 at 3:54 am
  
  I have updated my code and the blog post to include results with std::set.
Dan says:

January 28, 2017 at 3:32 pm

Have you tried a flat_set implementation? I’d imagine it would avoid some of the data locality cost.
1. Daniel Lemire says:
  
  January 30, 2017 at 3:11 pm
  
  I have not tested Boost’s flat_set.
alanwj says:

January 28, 2017 at 10:27 pm

You need to tell us which implementation of C++ you used for any benchmarks to be meaningful.

It would also be interesting to run your benchmarks on several implementations and compare results.
1. Daniel Lemire says:
  
  January 30, 2017 at 3:03 pm
  
  You need to tell us which implementation of C++ you used for any benchmarks to be meaningful.
  It would also be interesting to run your benchmarks on several implementations and compare results.
  
  I do a lot better than that… I provide my source code so that you can get your own numbers quickly.
Mathias Gaunard says:

January 29, 2017 at 1:27 pm

That comparison is pretty meaningless.
Of course set_intersection is faster. The costly operation here is sorting.

But still, an unordered_set is a different thing than a sorted array, it is optimized for lookup, not merging.
1. Daniel Lemire says:
  
  January 30, 2017 at 2:56 pm
  
  The costly operation here is sorting.
  
  There is no sorting done in the case of unordered_set.
  
  But still, an unordered_set is a different thing than a sorted array, it is optimized for lookup, not merging.
  
  I agree but I am not sure it is a priori obvious to the programmer.
2. Jouni says:
  
  January 30, 2017 at 7:02 pm
  
  Sorting integers is actually pretty cheap. I got something like 4-10 cycles/element, depending on the contents of the array.
  
  The lesson is that random memory accesses are expensive, and it’s hard to avoid them in a data structure that supports updates.
  
  Mutable data structures are optimized for latency. If you’re more interested in throughput, batch processing with sorted arrays can easily be an order of magnitude faster.
Daniel Lemire says:

January 30, 2017 at 7:53 pm

Mutable data structures are optimized for latency. If you’re more interested in throughput, batch processing with sorted arrays can easily be an order of magnitude faster.

Yes.
Sami says:

September 20, 2019 at 8:06 am

Comment avez vous fait pour calculer le nombre de cycles? Il y’a une option dans le compilateur qui permet ça?
Merci
1. Daniel Lemire says:
  
  September 20, 2019 at 3:36 pm
  
  Je vous invite à consulter mon code source.
Ciprian says:

May 11, 2021 at 9:40 pm

Hello Daniel,

I have read your source code.
setunion could be improved by copying one of the sets (the larger, obviously, though in your code the two sets have the same size) instead of “insert”ing it front to back.
I don’t know what sort of code is generated behind for std::set intersection and union, but generic stuff like the example possible implementations shown on en.cppreference.com will work exceptionally bad for BSTs. And that’s on top of BSTs (std::set is usually a red/black tree) being particularly unsuited for such operations. In fact, if I’d ever want to intersect/join two std::sets, I’d probably copy them to std::vectors (no need of sorting), do the operation on the latter and bulkload somehow the result to the output std::set.
1. Daniel Lemire says:
  
  May 11, 2021 at 10:22 pm
  
  setunion could be improved by copying one of the sets (the larger, obviously, though in your code the two sets have the same size) instead of “insert”ing it front to back.
  
  Yes. It is still going to be slow, but your proposal would help make it a bit less terrible.