Daniel Lemire's blog

, 4 min read

19 random digits is not enough to uniquely identify all human beings

6 thoughts on “19 random digits is not enough to uniquely identify all human beings”

  1. Marcin Zukowski says:

    This reminds me of an old idea for hash tables, useful esp. with complex keys.

    With 128-bit hash values, if the hash function is good, the probability of an actual key conflict on the same hash value is smaller than the probability of a server getting hit by a comet 🙂 So we don’t need to do the value equality check, possibly saving a lot of time.

    Alas, I never trusted a hash function enough to actually apply this in a production system.

    1. I love your comment. See my blog post… Hashing and the Birthday paradox: a cautionary tale.

    2. Jimbo says:

      It would be nice if you added the calculation for this probability.

      1. Have you checked out the earlier blog post I link to?

        https://lemire.me/blog/2019/12/12/are-64-bit-random-identifiers-free-from-collision/

      2. Marcin Zukowski says:

        Admittedly, it depends a bit on your assumptions and how you compute things

        There are 3 considerations:

        Size of a hash domain: 2^128
        Number of distinct keys that might be used in the same environment where a collision might cause a problem – here one can argue, I think something like 2^50 is very generous.
        Probability of a meteorite hitting the Earth

        From 1 and 2 we can get the probability of collision of roughly 1 in 10^9.

        Assuming a civilization-destroying meteorite hits us every 1 million years, that means that a probability of it hitting on a given day is more than 1 in 10^9.

        😀

  2. Anonymouse says:

    There’s a very recent preprint addressing exactly this problem: https://arxiv.org/pdf/2304.07109.pdf