Daniel Lemire's blog

, 6 min read

Compressed bitset libraries in C and C++

9 thoughts on “Compressed bitset libraries in C and C++”

  1. Rob says:

    Hi Daniel,

    Nice post! I’ve peeked around a few of these myself before in a search for a good dynamically-sized bitset for C++. Do you have any thoughts on if it might make sense to add rank and select operations to any of these representations? Usually, when I’m using a compressed (or even uncompressed) bit-vector, rank, select and access are the 3 operations I’m most in need of.

    1. CRoaring supports rank and select in both the C and C++ interfaces (and in Java and Go), https://github.com/RoaringBitmap/CRoaring/blob/master/include/roaring/roaring.h

  2. Alex Chen says:

    How would you compare the bitset library in stl bundled in standard c++ with the libraries you listed. Is there any use case where it is clearly advantageous to use third party bitset libraries as opposed to the standard library? Thanks!

    1. I present a comparison with STL data structures in this recent talk reproduced on YouTube: https://youtu.be/ubykHUyNi_0

      I have also covered the topic in various ways on my blog over the years… http://lemire.me/blog/2012/11/13/fast-sets-of-integers/
      http://lemire.me/blog/2017/01/27/how-expensive-are-the-union-and-intersection-of-two-unordered_set-in-c/

      And I’ll write more.

  3. Benoit says:

    Hello and thank you for the links.
    Are those libraries suitable for building a Bloom filter?
    I understand they are overkill, but do their storage and read access performances make the a good choice?

  4. Arthur Silva says:

    None of these libraries expose functions to shift (as in bit shift) bits around, I guess these operations don’t play well with compression. Are there any compressed bitset algorithms that could potentially support these operations with reasonable efficiency?

    1. The javaewah has a shift function, see http://www.javadoc.io/doc/com.googlecode.javaewah/JavaEWAH/1.1.6

      It has not been implemented in the C/C++ libraries simply because there has been no interest shown for the feature.

      Of course, a shift operation in generally linear time… so shifting bitsets arbitrarily and frequently is maybe unwise.

  5. Max Lybbert says:

    There is also FastBit ( https://sdm.lbl.gov/fastbit/ ).

  6. lily says:

    Are those libraries reasonable for building a Bloom channel?

    Probably not.