Daniel Lemire's blog

, 5 min read

Fast divisionless computation of binomial coefficients

4 thoughts on “Fast divisionless computation of binomial coefficients”

  1. George Spelvin says:

    I know you know this, but to clarify for others: a division may require a multiply and two shifts (case 3 in the Granlund-Montgomery-Warren algorithm) or more (case 4). Computing binomial coefficients can use a single shift because it falls into the special case of a division which is known a priori to be exact.

    Also, you can eke a tiny bit more range out of fastbinomial(n,k) if you do the multiplication by f.inverse before the shift.

    The largest value which must be represented internally within fastbinomial is fastbinomial(n,k) * k; if that overflows 64 bits, the result will be inaccurate. However, the overflow does not matter for the multiplication by the inverse, only the shift, so only fastbinomial(n,k) << f.shift needs to fit into 64 bits.

    1. Also, you can eke a tiny bit more range out of fastbinomial(n,k) if you do the multiplication by f.inverse before the shift.

      I do not understand this comment. Can you elaborate?

    2. Computing binomial coefficients can use a single shift because it falls into the special case of a division which is known a priori to be exact.

      And if anyone is wondering why the division is exact (like I was), I wrote up an explanation here: https://ro-che.info/articles/2020-03-22-binomial-coefficients-integer-division

  2. Sujay Jayakar says:

    I was just working on a similar problem, and for my application I have 0 <= n <= 64 and 0 <= k <= 64, and storing a lookup table is acceptable. Here’s a trick I used for using symmetry to only store half of each row.

    We start by precomputing all binom(n, k) via the recurrence B(n, k) = B(n - 1, k - 1) + B(n - 1, k) and B(n, n) = 1 and concatenating the rows into a flat array. Then, computing B(n, k) involves looking up the start of the nth row and taking the kth element of that row. The ith row has length i + 1, so the nth row starts at 1 + 2 + ... + n, which is n * (n + 1) / 2.

    1
    1 1
    1 2 1
    1 3 3 1
    1 4 6 4 1

    But, we’d like to store only the first half of each row, computing B(n, k) as B(n, min(k, n - k)).

    1
    1
    1 2
    1 3
    1 4 6

    Now, the length of the nth row is ceil((i + 1) / 2), and since ceil((n + 1) / m) = floor(n / m) + 1, we can compute this as n // 2 + 1 using integer division. To compute the start of the nth row in our table, we want to compute \sum_{i=0}^{n-1} {i // 2 + 1} in closed form.

    n: 0 1 2 3 4 5 ...
    row_len: 1 1 2 2 3 3 ...
    row_start: 0 1 2 4 6 9 ...

    If we have even n = 2m, then the sum of the previous row_lens is just 2 * (1 + 2 + ... m):

    row_start(2m) = 2 * (1 + 2 + ... + m)
    = 2 * m * (m + 1) / 2
    = m * (m + 1)

    Then, if we have odd n = 2m + 1, we need to add in row_len(2m).

    row_start(2m + 1) = m * (m + 1) + row_len(2m)
    = m * (m + 1) + (m + 1)
    = (m + 1) * (m + 1)

    So we can now combine the two cases:

    row_start(n) = (n // 2 + n % 2) * (n // 2 + 1)

    The resulting table is compact, and querying it is efficient.

    fn binomial_coefficient(n: u8, k: u8) -> u64 {
    let (q, r) = (n as usize / 2, n as usize % 2);
    let row_start = (q + r) * (q + 1);
    let k = cmp::min(k, n - k) as usize;
    COEFFICIENT_TABLE[row_start + k]
    }

    My code is in this pull request.