Daniel Lemire's blog

, 11 min read

How expensive are the union and intersection of two unordered_set in C++?

16 thoughts on “How expensive are the union and intersection of two unordered_set in C++?”

  1. bsf says:

    It would have been cool to compare std::map and std::set, as these are ordered by design but have the pointer chasing issues.

    1. bsf says:

      Or are you not including the time spent to insert() in your measurements? That’s a little confusing.

      If I have to add the elements to a set in order to sort them (by your proposed algorithm) then we should be including that time.

      Anyways, if you are including insert time then it doesn’t compare a plain prepopulated set, and if you aren’t including insert time then the numbers are not totally correct.

      I think

      1. Or are you not including the time spent to insert() in your measurements? That’s a little confusing.

        I think I describe the benchmark accurately, and I provide the full source code that you can review. There should be no confusion.

        The test is as follow: take two objects of some class (std::vector, std::set, std::unordered_set) and produce a new one of the same class that represents either the union or the intersection.

        So, we do not include the time necessary to create the original two sets, but we measure the time required to create either the intersection or the union.

        Obviously, you can question whether it is “fair” but that becomes subjective.

    2. I have updated my code and the blog post to include results with std::set.

  2. Dan says:

    Have you tried a flat_set implementation? I’d imagine it would avoid some of the data locality cost.

    1. I have not tested Boost’s flat_set.

  3. alanwj says:

    You need to tell us which implementation of C++ you used for any benchmarks to be meaningful.

    It would also be interesting to run your benchmarks on several implementations and compare results.

    1. You need to tell us which implementation of C++ you used for any benchmarks to be meaningful.
      It would also be interesting to run your benchmarks on several implementations and compare results.

      I do a lot better than that… I provide my source code so that you can get your own numbers quickly.

  4. Mathias Gaunard says:

    That comparison is pretty meaningless.
    Of course set_intersection is faster. The costly operation here is sorting.

    But still, an unordered_set is a different thing than a sorted array, it is optimized for lookup, not merging.

    1. The costly operation here is sorting.

      There is no sorting done in the case of unordered_set.

      But still, an unordered_set is a different thing than a sorted array, it is optimized for lookup, not merging.

      I agree but I am not sure it is a priori obvious to the programmer.

    2. Jouni says:

      Sorting integers is actually pretty cheap. I got something like 4-10 cycles/element, depending on the contents of the array.

      The lesson is that random memory accesses are expensive, and it’s hard to avoid them in a data structure that supports updates.

      Mutable data structures are optimized for latency. If you’re more interested in throughput, batch processing with sorted arrays can easily be an order of magnitude faster.

  5. Mutable data structures are optimized for latency. If you’re more interested in throughput, batch processing with sorted arrays can easily be an order of magnitude faster.

    Yes.

  6. Sami says:

    Comment avez vous fait pour calculer le nombre de cycles? Il y’a une option dans le compilateur qui permet ça?
    Merci

    1. Je vous invite à consulter mon code source.

  7. Ciprian says:

    Hello Daniel,

    I have read your source code.
    setunion could be improved by copying one of the sets (the larger, obviously, though in your code the two sets have the same size) instead of “insert”ing it front to back.
    I don’t know what sort of code is generated behind for std::set intersection and union, but generic stuff like the example possible implementations shown on en.cppreference.com will work exceptionally bad for BSTs. And that’s on top of BSTs (std::set is usually a red/black tree) being particularly unsuited for such operations. In fact, if I’d ever want to intersect/join two std::sets, I’d probably copy them to std::vectors (no need of sorting), do the operation on the latter and bulkload somehow the result to the output std::set.

    1. setunion could be improved by copying one of the sets (the larger, obviously, though in your code the two sets have the same size) instead of “insert”ing it front to back.

      Yes. It is still going to be slow, but your proposal would help make it a bit less terrible.