Excellent. Very easy to use! Thanks!
Has FastPFOR been used/evaluated in a real context such as Lucene text search?
It’s used in https://github.com/manticoresoftware/columnar which can be used with Manticore Search – https://github.com/manticoresoftware/manticoresearch/
@anonymous
Lucene uses what is effectively the FastPFOR algorithm; its implementation was inspired by the JavaFastPFOR library. As for using the C++ library, I do not know whether it is practical, since Lucene is written in Java.
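For anyone who does want to try the C++ library directly, here is a minimal sketch of the encode/decode pattern from the FastPFor README; the header path and the codec name ("simdfastpfor256") may vary across versions of the library.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>
#include "codecfactory.h"  // from the FastPFor library

int main() {
  using namespace FastPForLib;
  CODECFactory factory;
  // Codec names vary by version; "simdfastpfor256" is one of the SIMD codecs.
  IntegerCODEC &codec = *factory.getFromName("simdfastpfor256");

  std::vector<uint32_t> data(1000);
  for (uint32_t i = 0; i < data.size(); ++i) data[i] = i * 3;  // toy input

  // Compression: leave headroom, then shrink to the actual compressed size.
  std::vector<uint32_t> compressed(data.size() + 1024);
  size_t compressedSize = compressed.size();
  codec.encodeArray(data.data(), data.size(), compressed.data(), compressedSize);
  compressed.resize(compressedSize);

  // Decompression back into a buffer of the original length.
  std::vector<uint32_t> recovered(data.size());
  size_t recoveredSize = recovered.size();
  codec.decodeArray(compressed.data(), compressed.size(),
                    recovered.data(), recoveredSize);
  recovered.resize(recoveredSize);

  std::cout << "compressed " << data.size() << " ints into "
            << compressedSize << " 32-bit words\n";
}
```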
powturbo says:
In my tests, SIMD bit packing offers no speed advantage over optimized scalar bit packing when used with large buffers (see simplebenchmark in FastPFor). This is valid for most applications (e.g., an inverted index). A realistic benchmark should compare SIMD and scalar bit packing only on large buffers.
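For context, "scalar bit packing" means tightly packing b-bit integers into machine words using ordinary (non-SIMD) instructions. Here is a minimal illustrative sketch; the real routines in FastPFor are generated and fully unrolled per bit width, so this is only meant to show the idea:

```cpp
#include <cstdint>

// Pack 32 values, each fitting in b bits (1 <= b <= 32), into exactly b
// 32-bit output words. Scalar version: one value at a time through a
// 64-bit accumulator.
void pack32(const uint32_t *in, uint32_t *out, unsigned b) {
  uint64_t acc = 0;   // pending bits, least-significant first
  unsigned used = 0;  // number of pending bits in acc
  for (int i = 0; i < 32; ++i) {
    acc |= (uint64_t)in[i] << used;
    used += b;
    if (used >= 32) {            // a full output word is ready
      *out++ = (uint32_t)acc;
      acc >>= 32;
      used -= 32;
    }
  }
}

// Inverse of pack32: read b 32-bit words, emit 32 b-bit values.
void unpack32(const uint32_t *in, uint32_t *out, unsigned b) {
  uint64_t acc = 0;
  unsigned avail = 0;
  uint64_t mask = (b == 32) ? 0xFFFFFFFFu : ((1u << b) - 1);
  for (int i = 0; i < 32; ++i) {
    if (avail < b) {             // refill the accumulator
      acc |= (uint64_t)(*in++) << avail;
      avail += 32;
    }
    out[i] = (uint32_t)(acc & mask);
    acc >>= b;
    avail -= b;
  }
}
```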
@powturbo
If you are going to take the data from RAM, bring it all the way to the L1 cache, load it into registers, and then push it all the way back out to RAM… you are bound by memory bandwidth: your CPU sits idle, so saving CPU cycles becomes irrelevant. To make things worse, you can pretty much forget about using more than one core, because your L3 cache is going to be overwhelmed by a single core.
So? So you avoid decompressing whole arrays to RAM.
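In other words: decode one cache-resident block at a time and consume it while it is still hot, rather than materializing the whole uncompressed array. A hypothetical sketch of that access pattern follows; decode_block and the block size are illustrative stand-ins, not the library's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder per-block "decoder": an identity copy standing in for a real
// codec call. It expands one block into `out` and returns the number of
// compressed 32-bit words it consumed. (Hypothetical helper, not FastPFor API.)
static size_t decode_block(const uint32_t *in, uint32_t *out, size_t n) {
  for (size_t i = 0; i < n; ++i) out[i] = in[i];
  return n;
}

// Stream over a compressed array block by block. Each decoded block fits in
// L1 cache and is consumed immediately, so the full uncompressed array is
// never written back to RAM.
uint64_t sum_all(const std::vector<uint32_t> &compressed, size_t count) {
  constexpr size_t kBlock = 4096;   // 16 KB of uint32_t: roughly L1-sized
  uint32_t buffer[kBlock];          // scratch block, reused every iteration
  const uint32_t *in = compressed.data();
  uint64_t sum = 0;
  for (size_t done = 0; done < count; done += kBlock) {
    size_t n = (count - done < kBlock) ? (count - done) : kBlock;
    in += decode_block(in, buffer, n);               // decompress this block only
    for (size_t i = 0; i < n; ++i) sum += buffer[i]; // consume while cache-hot
  }
  return sum;
}
```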
We have demonstrated directly the benefit of SIMD bit packing in our latest paper (see http://arxiv.org/abs/1401.6399).
If you’re out of disk space, is there a way to handle updates that won’t require additional scratch space?
@Garen
This particular library does not handle disk storage at all (by design). However, there is no particular problem using this library with updates. In fact, it compresses super fast, so recompressing updated blocks should be quite fast.
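A rough sketch of that update pattern, reusing the encode/decode calls shown earlier: decode the affected block, modify it, and re-encode it. The fixed-length block framing is an assumption for illustration; FastPFor itself compresses arrays of integers and knows nothing about storage. Only block-sized scratch buffers are needed.

```cpp
#include <cstdint>
#include <vector>
#include "codecfactory.h"  // from the FastPFor library

// Re-encode a single logical block after an in-place update. The block
// framing (kBlockLen values per block, one compressed buffer per block)
// is hypothetical, not part of the library.
void update_value(FastPForLib::IntegerCODEC &codec,
                  std::vector<uint32_t> &compressedBlock,
                  size_t kBlockLen, size_t offsetInBlock, uint32_t newValue) {
  // 1. Decompress just the affected block.
  std::vector<uint32_t> values(kBlockLen);
  size_t n = values.size();
  codec.decodeArray(compressedBlock.data(), compressedBlock.size(),
                    values.data(), n);

  // 2. Apply the update in the small decoded buffer.
  values[offsetInBlock] = newValue;

  // 3. Recompress: fast enough that this is cheap for block-sized data.
  std::vector<uint32_t> recompressed(kBlockLen + 1024);
  size_t outSize = recompressed.size();
  codec.encodeArray(values.data(), n, recompressed.data(), outSize);
  recompressed.resize(outSize);
  compressedBlock.swap(recompressed);  // replace the old compressed block
}
```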