Daniel Lemire's blog

, 5 min read

Should you cache hash values even for trivial classes?

7 thoughts on “Should you cache hash values even for trivial classes?”

  1. Leonid Boytsov says:

    Daniel,

    Java Objects are bloated. For most efficient storage of triples, one needs to use type[3], e.g., int[3].

    1. You cannot use Java arrays as keys in a hash table. Try it.

      1. Sorry, I forgot about this. BTW, I tried to keep integers as 3-int array INSIDE an object, but this only slowed things down. In a hindsight, this apparently only creates an extra “lookup layer”.

        1. Sorry, I forgot about this.

          Honestly, I looked it up while preparing this blog post. I could not believe that arrays did not hash properly (as values) in Java. Mind you, I still cannot believe that Java lacks value types.

  2. Hi Daniel,

    for me it’s not the computation in itself, but accessing data for computing the hash. Zven in L1, accessing data will be slower than performing add and mul operations

    Cheers!

    1. for me it’s not the computation in itself, but accessing data for computing the hash. Zven in L1, accessing data will be slower than performing add and mul operations

      This can be tested. Lay out the objects in L1 cache and check how many instructions you get per cycle while computing hash values. If you are correct, then you will get far fewer than one instruction per cycle (the CPU will spin empty).

      I believe you are not correct but I’d be interested in seeing the numbers.

  3. alfC says:

    Mmm, there must be some big trade off somewhere. To begin, an increase of memory footprint and a decrease in the cache utilization. Mutability will have cost as well. Did you try it in the context of C++?