Daniel Lemire's blog

, 5 min read

Simple table size estimates and 128-bit numbers (Java Edition)

6 thoughts on “Simple table size estimates and 128-bit numbers (Java Edition)”

  1. Travis Downs says:

    To address the precision problem you could also take the first few terms of the binomial expansion of (1 – 1/p)^n – with a large p the terms go to zero essentially immediately so only the first few terms matter. The first term is 1, which nicely cancels out with the the preceeding (1 - part. Finally, you can multiply in the initial p since it cancels out also, since every other term is of the form C / p^m.

    So you are left only with a few terms like A + B/p + C/p^2 … where A, B, C are the binomial coefficents (are these easy to calculate when big? maybe).

    I dunno if that’s better or worse, but it’s a common approach to tackling this kind of problem: re-arranging the problematic part of the formula.

    1. Travis Downs says:

      Forgot to add: I haven’t tried it, so I could definitely be full of it…

    2. Maynard Handley says:

      Since p is HUGE, the zero’th order approximation is going to be pretty much perfect.
      (1-1/p)^n~ 1-n/p,
      expand everything out, and to zeroth’s order the approximation is n.

      If you want to go further, do a Taylor series in 1/p and you get
      n – ((-1 + n) n)/(2 p) + ((-2 + n) (-1 + n) n)/(6 p^2)

      But these correction terms are miniscule.
      The exact (to 20 digits) answer is
      and that correction (of magnitude 10^-12) is the first order in 1/p correction. The second order correction is magnitude 10^-30.

      So, yeah, for all practical purpose, no need for 128-bit arithmetic, just approximate Cardenas ~= n

  2. brian says:

    You can use the lim (1+1/n)^n approximation to $e$ to solve this problem mathematically. An example computation: http://www.moderndescartes.com/essays/birthday_paradox/

  3. This can also be done in standard “double” arithmetic, using the log1p and expm1 functions, which are inteded exactly to counter this sort of instabilities.

    See my answer here for how to perform exactly this calculation: https://math.stackexchange.com/a/2317045/65548 . I guess this method is going to be much faster than using BigDecimal.

    1. Aptroot on twitter suggested product * -Math.expm1(Math.log1p(-1.0/product) * n).