Daniel Lemire's blog

, 5 min read

Are 64-bit random identifiers free from collision?

8 thoughts on “Are 64-bit random identifiers free from collision?”

  1. Interesting! I’d be curious to know why you wrote a post about that. Did you encounter such a problem in your own work?

    1. Yes. Some people report using 64-bit hash values as identifiers and they sometimes believe that it makes collision highly improbable. That’s true when dealing with tiny sets, but as I demonstrate, for many practical cases, collisions are actually quite likely.

      So, yes, this post is motivated by my discussions with people building systems.

  2. Peter says:

    And with 64-bits, they are easy to find! I did this exercise recently with FNV-1a: https://twitter.com/pstbrn/status/1201044024892829697

    1. Peter says:

      (This is hash collisions, not random identifiers. But finding them used the same principle – birthday attack.)

  3. Randomly generated UUIDs, which are 122 bits, are less problematic: the probability of getting a duplicate is much lower than the probability of being hit by a meteorite (I calculated that some time ago and put it on Wikipedia, but somebody edited it away).

    128-bit values are even better. However, there is a risk if you generate them with a “lower quality” hash function (not sure what low quality is… Murmur hash seems OK, SipHash is better I think). If an attacker can supply the data, then MD4, MD5, and even SHA-1 are risky: the attacker could use specially crafted data that will result in a hash collision. So, for this case, better use SHA-256.

  4. John the Scott says:

    what about a 160 bit hash built by ripemd(sha256), i.e, bitcoin?

    1. 160-bit is perfectly fine.

      There is a huge difference between 64-bit and 160-bit.

  5. Ricardo says:

    And that is why I always choose sequential id assignment instead of random 🙂

    This is, however, an interesting idea. What would be the probability to collide twice in a row if whenever we find a collision we retry and generate a new random 64-bot id?