21st October 2021, 9 min read

Converting binary floating-point numbers to integers

15 thoughts on “Converting binary floating-point numbers to integers”

Josh Haberman says:

October 21, 2021 at 1:16 am

I think the first example is undefined behavior in the case that the double is not representable in the target integer type: https://godbolt.org/z/54aq4r5oj
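
A minimal sketch of how such a conversion could be guarded, assuming a 64-bit unsigned target. The names `fits_in_uint64` and `to_uint64_or` are hypothetical and do not come from the post. In C++, a floating-point to integer cast truncates toward zero, and the behavior is undefined when the truncated value does not fit in the destination type, so the range check has to come before the cast:

```cpp
#include <cstdint>

// Hypothetical helper: true when static_cast<uint64_t>(x) is well defined,
// i.e. when x truncated toward zero lies in [0, 2^64 - 1].
bool fits_in_uint64(double x) {
    // 2^64 is exactly representable as a double.
    const double two_to_64 = 18446744073709551616.0;
    // Both comparisons are false for NaN, so NaN is rejected as well.
    return x > -1.0 && x < two_to_64;
}

// Hypothetical helper: converts when it is safe, otherwise returns `fallback`.
uint64_t to_uint64_or(double x, uint64_t fallback) {
    return fits_in_uint64(x) ? static_cast<uint64_t>(x) : fallback;
}
```

Values such as -0.5 pass the check because they truncate to 0. Whether that should count as "representable" depends on the caller.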
Nicolas says:

October 21, 2021 at 7:37 am

Very interesting post.
Don’t you need to shift bits left by 1 before testing for equality with zero (to account for +0 and -0)?
1. Daniel Lemire says:
  
  October 21, 2021 at 6:13 pm
  
  See my other reply: it is debatable whether -0 can be represented as a uint64 value.
  
  (Your comment was not lost, it just needed approval.)
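
To make the question concrete, here is a small sketch, assuming `bits` is the raw IEEE 754 bit pattern of the double (the post's exact code is not reproduced here, and the function names are hypothetical). The sign is the top bit, so +0 has all bits clear while -0 has only the top bit set. Shifting left by one discards the sign before the comparison:

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of a double into a 64-bit integer.
uint64_t raw_bits(double x) {
    static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");
    uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Accepts +0.0 only: -0.0 has the sign bit set, so its bits are nonzero.
bool is_positive_zero(double x) { return raw_bits(x) == 0; }

// Accepts +0.0 and -0.0: the shift drops the sign bit first.
bool is_any_zero(double x) { return (raw_bits(x) << 1) == 0; }
```

Which of the two tests is appropriate depends on whether -0 should be treated as convertible, which is the point under debate in this thread.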
Urgau says:

October 21, 2021 at 9:15 am

I’ve executed your code on one of my armhf machines and the results are surprising.

$ time ./isint.g++
499999999500000000
39.8756
499999999500000000
96.3353

real 4m32,431s
user 4m32,237s
sys 0m0,196s

$ time ./isint.clang++
499999999500000000
28.1106
499999999500000000
98.3311

real 4m12,893s
user 4m12,060s
sys 0m0,508s

Your code is way faster on my armhf machine.
Note that g++ (Debian 8.3.0-6) 8.3.0 and clang version 7.0.1-8+deb10u2 are not the most recent versions.
Nicolas B. says:

October 21, 2021 at 3:39 pm

Very interesting post.
Shouldn’t bits be shifted left by one bit before testing for zero with (bits == 0), to account for the +0 and -0 cases?

(my comment may have been lost the first time)
1. Daniel Lemire says:
  
  October 21, 2021 at 6:12 pm
  
  It is debatable whether -0 can be represented as a uint64 value.
Todd Lehman says:

October 22, 2021 at 1:58 am

As a special case, I wonder how fast it would be for a custom shortcut routine to convert a known-integer value stored in a 32- or 64-bit floating-point variable into a uint64_t value?

What I mean is: If you happen to know that a floating-point value is an integer (which can certainly happen sometimes), can the conversion be performed measurably faster?
1. Daniel Lemire says:
  
  October 22, 2021 at 5:07 pm
  
  It is an interesting question.
Jorge says:

October 22, 2021 at 9:28 am

If I remember correctly, there were some instruction extensions in Arm to support this kind of operation directly, with the aim of accelerating conditional code in JavaScript, where all numbers have type double.
See “Improved Javascript data type conversion”: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-a-architecture-2016-additions
1. Daniel Lemire says:
  
  October 22, 2021 at 5:25 pm
  
  Ah. So it would work much the same as my simple approach but a flag is set when the conversion is not exact. Interesting.
Lars Bonnichsen says:

October 23, 2021 at 11:51 am

If the number fits into the mantissa, then the integer conversion can be done with one floating-point addition and one integer addition, which should be faster.

https://paperzz.com/doc/8232041/fast-rounding-of-floating-point-numbers-in-c-c–
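
One commonly described variant of this idea is sketched below. It is not necessarily the exact routine from the linked document. It assumes IEEE 754 doubles, the default round-to-nearest mode, no `-ffast-math`, and an input with magnitude below 2^51. The name `round_via_magic` is hypothetical. Adding 2^52 + 2^51 forces the sum into the range where doubles are spaced exactly 1 apart, so the rounded integer ends up in the low mantissa bits and can be extracted with a single integer subtraction:

```cpp
#include <cstdint>
#include <cstring>

// Rounds x to the nearest integer (ties to even) without a convert instruction.
// Assumption: |x| < 2^51, so x + magic stays within [2^52, 2^53).
int64_t round_via_magic(double x) {
    const double magic = 6755399441055744.0;  // 2^52 + 2^51
    const double shifted = x + magic;         // the floating-point addition

    uint64_t bits, magic_bits;
    std::memcpy(&bits, &shifted, sizeof bits);
    std::memcpy(&magic_bits, &magic, sizeof magic_bits);

    // Both patterns have a clear sign bit, so the signed casts are safe.
    // The difference of the bit patterns is the rounded value of x.
    return static_cast<int64_t>(bits) - static_cast<int64_t>(magic_bits);
}
```

Note that this rounds to nearest rather than truncating, so it matches a plain cast only when the input is already an integer.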
Dmitry Akimov says:

October 25, 2021 at 10:23 am

Hello Daniel. In modern C++, the legal and optimal way to do a binary conversion between unrelated types is to use std::memcpy or std::bit_cast.
1. Daniel Lemire says:
  
  October 25, 2021 at 11:40 am
  
  I am aware, but a union is also legal, is it not?
  1. Nicolas B. says:
    
    October 25, 2021 at 8:33 pm
    
    I thought unions were only legal if you write to and read from the same field (it is not possible to write one field and interpret the data another way by reading another one). But I may be mixing up the C and C++ standards.
    1. Daniel Lemire says:
      
      October 25, 2021 at 9:08 pm
      
      It seems that you are correct:
      “It’s undefined behavior to read from the member of the union that wasn’t most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union.”
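
For reference, a minimal sketch of the `std::memcpy` approach mentioned in this thread, written as a generic helper that works before C++20. The name `bit_copy` is hypothetical. With C++20, `std::bit_cast` from `<bit>` plays the same role and is usable in constant expressions:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reinterprets the bytes of `from` as a value of type To without relying on
// reading an inactive union member.
template <typename To, typename From>
To bit_copy(const From &from) {
    static_assert(sizeof(To) == sizeof(From), "types must have the same size");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}
```

For example, `bit_copy<uint64_t>(1.0)` yields `0x3FF0000000000000` on platforms with IEEE 754 doubles. Mainstream compilers typically compile the `memcpy` call down to a single register move.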