21st July 2016, 3 min read

Accelerating PHP hashing by “unoptimizing” it

Isinlor says:

July 22, 2016 at 8:00 am

I allowed myself to post your finding on PHP reddit to spread them in community 🙂 .

https://www.reddit.com/r/PHP/comments/4u1l5z/accelerating_php_hashing_by_unoptimizing_it/
1. mmc says:
  
  July 22, 2016 at 5:37 pm
  
  33 is not prime 🙂
  1. Daniel Lemire says:
    
    July 22, 2016 at 7:44 pm
    
    Point taken. It is odd so coprime with respect to powers of two.
Simon says:

July 22, 2016 at 10:18 pm

I raced the DJB hash, the DJB hash with multiplier, and murmurhash3 on a Haswell CPU with ~ 8000 real keys which get hashed when ‘php -v’ in invoked.

I found that they all run at about the same speed on the Haswell. mmh3 never gets into its stride because although it doesn’t hash inefficiently byte by byte, it does do for the tail block of up to 15 bytes, and all of the real keys are under 39 bytes long.

So I raced again using the real set of keys which had been padded to the next 16 byte boundary. This means mmh3 never had to run its slow byte by byte tail code. The result is that mmh3 is nearly 50% faster than the DJB hash.

This makes me wonder how easy it would be to replace the DJB hash in PHP with mmh3, and change PHP strings so that they are always stored padded to the next 16 byte boundary. It would be interesting to see if any larger PHP scripts end up running faster?