Daniel Lemire's blog

, 6 min read

Run-length encoding (part 2)

7 thoughts on “Run-length encoding (part 2)”

  1. Itman says:

    Thank you for the interesting post.
    I would note, however, that not all universal codes are born equal. Specifically, byte-aligned codes and word-aligned codes are at least twice as fast as bit-oriented codes such as Elias delta/gamma or Golomb codes.
    They are used in search engines and allow very fast decompression. In many cases it takes the same time as to read the data from disk sequentially.
    See, e.g. “Index Compression using Fixed Binary Codewords. Vo Ngoc Anh, Alistair Moffat”
    PS: I also gonna read “How to Barter Bits for Chronons”. Seems to be very interesting. Thanks!

  2. @Itman Thank you for the great reference. I liked the Fixed Binary Codewords paper very much.

  3. Glenn Davis says:

    I’m compelled to take issue with your last sentence…

    “Unfortunately, the current breed of microprocessors are not kind to variable-length representations so the added compression is at the expense decoding speed.”

    My recent experience has tought me that compression and speed are no longer related that way, and largely for that reason. Today’s hierarchies of caches work in concert with out-of-order execution and other features to provide new avenues for the designer to exploit. These modern architectural features can be made to reward, with execution speed, high density in data. It is up to the designer to get the density, as well as the logic, that lets the hardware deliver the speed. To me, that’s the new reality.

    I say that with some conviction having just finished optimizing a fast decompressor for structured data. It uses canonical Huffman codes for the data, and a compressed variation of J. Brian Connell’s classic structures for decoding. During software optimization, time and time again, I was able to get further speed improvements by increasing the compression not only of the data, but also of the decoding data structures and their pointers. It was the variable-length coding, as much as any other design factor, that got me the information density I needed from the data to get the speed I needed from the system. In the end, that happened, I believe, primarily because the use of variable-length codes reduced the demand on a relatively slow path component, the system bus.

    Software optimization is not what it was years ago, and for me at least, neither are the relationships between compression and speed. But that won’t be everyone’s experience, so I would like to hear others’ opinions.

  4. Itman says:

    I have read the references on Chronos and Bits. It looks like one should use terms variable-bit and variable-byte methods very cautiously. It is also interesting that Huffman coding can be sped up considerably by using special lookup tables.
    My experience and a variety of experimental evaluations (just check the reference given by the author), say that in many cases more sophisticated compression methods introduce speed penalty. In particular, variable-bit methods are usually slower (but not always), that variable-byte methods.
    The difference, however, is subtle. In many cases, obviously, better compression rates allows to avoid expensive cache misses and even more expensive disk reads. In those case, better compression is obviously a priority.

  5. Glenn Davis says:

    I agree. It could go either way; so much depends on the specifics. But the stangest thing is that so often I find myself increasing compression in order to increase speed, and winning!

    BTW, there is an old paper (and a good one) by Debra Lelewer and Dan Hirschberg (CACM 4/90) that explores a lot of the Huffman code decoding issues.

  6. Itman says:

    “Efficient Decoding of Prefix Codes”?
    Looks like it is worth reading, thanks.

  7. Kevembuangga says:

    @Glenn Davis

    It was the variable-length coding, as much as any other design factor, that got me the information density I needed

    When confronted with a “problem” sometimes the best approach isn’t to solve it but to avoid it.
    Like, why using counters to spot peculiar points within an address range when you can use flags (bits), interleaved in data or not (sparse bit maps). 🙂