29th March 2018, 5 min read

Should you cache hash values even for trivial classes?

7 thoughts on “Should you cache hash values even for trivial classes?”

Leonid Boytsov says:

March 29, 2018 at 10:56 pm

Daniel,

Java Objects are bloated. For most efficient storage of triples, one needs to use type[3], e.g., int[3].
1. Daniel Lemire says:
  
  March 29, 2018 at 11:04 pm
  
  You cannot use Java arrays as keys in a hash table. Try it.
  1. Leonid Boytsov says:
    
    April 3, 2018 at 3:46 pm
    
    Sorry, I forgot about this. BTW, I tried to keep integers as 3-int array INSIDE an object, but this only slowed things down. In a hindsight, this apparently only creates an extra “lookup layer”.
    1. Daniel Lemire says:
      
      April 3, 2018 at 8:51 pm
      
      Sorry, I forgot about this.
      
      Honestly, I looked it up while preparing this blog post. I could not believe that arrays did not hash properly (as values) in Java. Mind you, I still cannot believe that Java lacks value types.
Jean-Philippe Bempel says:

March 30, 2018 at 3:27 pm

Hi Daniel,

for me it’s not the computation in itself, but accessing data for computing the hash. Zven in L1, accessing data will be slower than performing add and mul operations

Cheers!
1. Daniel Lemire says:
  
  March 30, 2018 at 3:48 pm
  
  for me it’s not the computation in itself, but accessing data for computing the hash. Zven in L1, accessing data will be slower than performing add and mul operations
  
  This can be tested. Lay out the objects in L1 cache and check how many instructions you get per cycle while computing hash values. If you are correct, then you will get far fewer than one instruction per cycle (the CPU will spin empty).
  
  I believe you are not correct but I’d be interested in seeing the numbers.
alfC says:

March 31, 2018 at 4:02 am

Mmm, there must be some big trade off somewhere. To begin, an increase of memory footprint and a decrease in the cache utilization. Mutability will have cost as well. Did you try it in the context of C++?