Daniel Lemire's blog

, 3 min read

Can you trust fixed-bit computer arithmetic?

Suppose that you have 10 pictures, and all lined up, they take 100 pixels. Is it safe to say that each picture has a width of x pixels if 10 x = 100?

We all know that a x = b has a unique solution x as long as a is non-zero. If you work with integers, then you can say that there is at most one solution.

Unfortunately, computer arithmetic is uglier. Over 32-bit integers, the equation a x= 0 has 231 solutions if a = 231. Indeed, as long as x is even, 231 x = 0.

But notice how I had to choose a large value of a to make computer arithmetic look bad. What if a is small? Certainly, if a = 1, then there is exactly one solution. If a = 2, then there are at most two solutions: 2 x can take any odd even and the most significant bit of x is discarded.

We can generalize this result a bit: if a = 2L, then there are at most 2L solutions x.

But we can generalize it even further! We have that a x = b has at most 2L integer solutions x if a is an L-bit non-zero integer. That is, as long as a__ is small, then__ a x__ =__ b__ has few solutions__. The proof is technical, but not difficult:

Proof (Sketch). Let a is an L-bit non-zero integer and suppose we work with K-bit arithmetic. We consider the equation a x div 2L = b modulo 2K – L. Let j be the first non-zero bit of a. E.g., if a = 7 then j = 1. Because ai = 0 for i < j, we have that the value of a x 2L is independent of the last j -1 bits of x. Moreover, we have that the most significant bit of x which matters (xK-j+1) only matters for the last bit of a x. We can solve for this last effectual bit xK-j+1 in (ax)K = bK-L as a function of bK-L, a and the lesser bits of x. We can continue solving for the lesser bits of x by consider a x = b minus the last bit. (In effect, we have reduced the problem in an integer ring with K bits to a similar problem in integer ring with K-1 bits.) After K-L steps, the process terminates (K=L) leaving L bits in x that can be set freely generating 2L different solutions. QED

Getting back to our orignal problem: Because the integer 10 can be written using 4 bits (0b1010), the equation 10 x = 100 has at most 24 = 16 solutions. So if x was just picked at random, chances are good that the assertion would fail: for 32-bit integers, the probability of a false positive is at most 15/232 which is much, much less than 1%.

Further reading: M. Dietzfelbinger, Universal hashing and k-wise independent random variables via integer arithmetic without primes, in: STACS’96, 1996.