Daniel Lemire's blog


Don’t read your data from a straw

3 thoughts on “Don’t read your data from a straw”

  1. Me says:

    One of the things that can hurt Java here is the classic big endian vs. little endian problem…
    In a lot of cases, Java is prepared to swap endianness to be compatible across different CPU architectures, whereas in C you usually have to insert htonl and ntohl calls manually. Is all your code endianness-safe?

    1. Thanks for raising this point.

      In the case above, we do Java-vs-Java comparisons so endianness is not an issue.

      In both Java and C/C++, you sometimes need to flip the bytes around. In C/C++, you have to check whether you have a big-endian or little-endian system, whereas in Java a ByteBuffer defaults to big endian whatever the hardware. Yet, in my own experience, I have been able to safely assume that all systems I care about are little endian. So I have designed binary formats that are explicitly little endian.

      This being said, the computational burden of reversing byte order is tiny.
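
      As a minimal sketch of what an explicitly little-endian read can look like in Java: the class and method names below (EndianDemo, readLittleEndianInt, readBigEndianInt) are hypothetical and chosen for illustration only, while ByteBuffer.order, ByteOrder and Integer.reverseBytes are standard library calls. It is not the code benchmarked in the post.

      ```java
      import java.nio.ByteBuffer;
      import java.nio.ByteOrder;

      // Hypothetical helper class, for illustration only.
      final class EndianDemo {

          // Read a 32-bit integer stored in little-endian order at offset 0.
          static int readLittleEndianInt(byte[] bytes) {
              // A freshly wrapped ByteBuffer is big endian, so set the order explicitly.
              return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
          }

          // Same read using the default (big-endian) order.
          static int readBigEndianInt(byte[] bytes) {
              return ByteBuffer.wrap(bytes).getInt();
          }

          public static void main(String[] args) {
              byte[] data = {0x78, 0x56, 0x34, 0x12};
              System.out.println(Integer.toHexString(readLittleEndianInt(data))); // 12345678
              System.out.println(Integer.toHexString(readBigEndianInt(data)));    // 78563412
              // Integer.reverseBytes converts one interpretation into the other.
              int flipped = Integer.reverseBytes(readBigEndianInt(data));
              System.out.println(flipped == readLittleEndianInt(data));           // true
              // The byte order of the machine running the JVM.
              System.out.println(ByteOrder.nativeOrder());
          }
      }
      ```

      Fixing the order once on the buffer keeps the file format independent of the machine that reads it, and Integer.reverseBytes covers the occasional value that arrives in the other order.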

  2. Roman Leventov says:

    ByteBuffer has a very unfortunate API. Pretty much all systems and high-performance projects in Java (Netty, Aeron, Chronicle, etc.) reimplement it on their own.

    We’ve built the Memory project (link) specifically to back data structure implementations in Java. It is used in DataSketches (link).