Daniel Lemire's blog

, 5 min read

Encoding binary in ASCII very fast

8 thoughts on “Encoding binary in ASCII very fast”

  1. Jeff Epler says:

    Is the input range actually up to 256, rather than 248 as the comment currently suggests?

    1. Yes. You are correct.

  2. Gil Pedersen says:

    To encode in four instructions:

    uint64_t convert_to_ascii(uint64_t x) {
    return ((0x2040810204081 * (x & 0x80808080808080)) & 0xff80808080808080) ^ x;
    }

  3. Travis Downs says:

    If you are willing to use a bigger chunk size, you can probably do it faster with something like:

    uint64_t encode(uint64_t bits, uint64_t& high) {
    high = (high >> 1) | (bits & 0x8080808080808080) ;
    return bits & 0x7F7F7F7F7F7F7F7F;
    }

    // encode a 64-byte input chunk into 72 output bytes
    uint64_t high = 0;
    for (size_t i = 0; i < 8; i++)
    result[0] = encode(input[0], high);
    result[8] = high;

    By my count this is fewer operations and all basic ops: no potentially expensive multiply. Also, it processes 8 bytes at a time, not 7, which keeps all accesses aligned (this matters much more on non-x86).

    On x86, you could take advantage of LEA, which can do (bits << 1) + x in a single instruction, but because you’re shifting left, not right, I think you can only process 7 bytes at a time (like your suggested algorithm), losing some benefit. Still, even at 7 bytes i calculate this as fewer total ops.

    1. randomPoster says:

      You must stop your loop one step before, and encode 56bytes in 64bytes, otherwise result[8] will not fit in ascii.

      1. Travis Downs says:

        Yes, that’s right: it should only be 7 iterations of the inner loop before storing the upper bits.

  4. Travis Downs says:

    The formatter wrecks the indentation in the code samples, but that should be:

    for (size_t i = 0; i < 8; i++) result[i] = encode(input[i], high);
    result[8] = high;

  5. Steven Stewart-Gallus says:

    Less instructions is not always faster.
    Not sure what the best way to exploit ILP is though.