Daniel Lemire's blog

Fast float parsing in practice

15 thoughts on “Fast float parsing in practice”

  1. degski says:

    Unfortunately, many standard libraries have not yet caught up with the standard and they fail to support from_chars properly.

    On Windows it’s highly optimized (by Stephan T. Lavavej himself). Perhaps you could add which std-libs are lagging or bad.

    1. Perhaps you could add which std-libs are lagging or bad.

      I’d be more interested in knowing which ones support it. The only one I have seen mentioned is Visual Studio. This will no doubt improve in the future, thankfully.

      On Windows it’s highly optimized

      Currently, this would not produce portable code since most other standard libraries I have tried do not support it. One portable approach is to rely on abseil.

      1. degski says:

        Implicitly, you’ve added it now to the comments.

        With some fiddling (if it is very important and worth the trouble, and after some digging in the relevant docs), one could create an object file with clang-cl and link it in on Linux or with MinGW. The thing has more or less a C API anyway.

        1. Seems easier to use abseil for the time being, no?
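For readers following the thread above, here is a minimal sketch of one way to cope with uneven `from_chars` support. It is not code from the post or the benchmark. The helper name `parse_double` is hypothetical, and the sketch assumes the library defines the standard feature-test macro `__cpp_lib_to_chars` when floating-point `std::from_chars` is usable. Otherwise it falls back to `strtod`, which is locale-dependent and more permissive (it accepts leading whitespace, for example).

```cpp
#include <charconv>
#include <cstdlib>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper (not from the post): parse the whole of `text` as a
// double, returning std::nullopt if parsing fails or trailing characters remain.
std::optional<double> parse_double(std::string_view text) {
#if defined(__cpp_lib_to_chars)
    // The standard library advertises std::from_chars support.
    double value = 0.0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last) {
        return std::nullopt;
    }
    return value;
#else
    // Assumed fallback: strtod needs a null-terminated buffer and depends on
    // the current C locale, so its behavior can differ from from_chars.
    std::string buffer(text);
    char* end = nullptr;
    double value = std::strtod(buffer.c_str(), &end);
    if (end == buffer.c_str() || *end != '\0') {
        return std::nullopt;
    }
    return value;
#endif
}
```

Calling `parse_double("3.14")` yields a value on either path, while `parse_double("1.5x")` yields `std::nullopt` because of the trailing character. Projects that need the same behavior everywhere may prefer a library implementation such as abseil's `absl::from_chars`, as suggested in the replies.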

  2. Could you please also check against the ‘new’ C++ format? They state that their implementation is also very fast.

    1. As far as I can tell, the “new” way to parse floats in C++ is to use “from_chars” and I address this both in my benchmarks and my post. If you are thinking about something else, would you kindly elaborate?

      1. Aha, ok. I didn’t know that it was based on the from_chars routine (I’m still on C++17). Seeing as this is also quite fast that sounds very good.
        Is there any chance the std::format can use your algorithm, or does that have to go through the committee?

        1. Standard libraries could certainly adopt the approach we have designed.

  3. Vinnie says:

    I’ve been doing some tests and I don’t think this is as fast as the algorithm in RapidJSON: https://github.com/Tencent/rapidjson/blob/7e68aa0a21b7800ec98133cb106e49bd6536e25c/include/rapidjson/internal/strtod.h#L131

    Am I correct in understanding that the goal of your number parser is to produce identical results to the C++ standard implementation, and this is the source of the performance difference from RapidJSON?

    Thanks

    1. You are correct.

      RapidJSON has at least two fast-parsing modes. The fast mode, which I think is what you refer to, is indeed quite fast, but it can be off by one ULP, so it is not standard compliant. Boost Spirit similarly offers fast parsing, but it is again not standard compliant.

      Our very own simdjson has also a fast number parsing mode…
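To make "off by one ULP" concrete, the following sketch counts how many representable doubles separate two values. It is an illustration only, not code from RapidJSON, Boost Spirit, or simdjson. The helper name `ulp_distance` is hypothetical, and it assumes IEEE-754 binary64 doubles and two finite inputs of the same sign.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: distance between a and b measured in units in the
// last place (ULPs). Assumes IEEE-754 binary64 and that a and b are finite
// and share the same sign, so their bit patterns are ordered like integers.
std::uint64_t ulp_distance(double a, double b) {
    std::uint64_t ia = 0;
    std::uint64_t ib = 0;
    std::memcpy(&ia, &a, sizeof a);  // reinterpret the bits without undefined behavior
    std::memcpy(&ib, &b, sizeof b);
    return ia > ib ? ia - ib : ib - ia;
}
```

With this helper, `ulp_distance(1.0, std::nextafter(1.0, 2.0))` is 1, which is the size of the error a fast but non-compliant parser may make. A correctly rounded parser always returns the double nearest to the decimal input.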

  4. Albert Chan says:

    David Gay’s dtoa.c was updated (2016) with a 96-bit big float, which speeds up his earlier version by quite a bit, an order of magnitude or more in some cases.

    see CHANGES dated 20160429,
    http://www.netlib.org/fp/

    1. Thank you.

      1. Note for people reading this: dtoa goes in the opposite direction.

        1. Marcin says:

          The file name is confusing; dtoa.c contains both dtoa() and strtod(). I suppose Albert meant the latter.

          1. @Marcin Yes, and I include a benchmark against the updated code at https://github.com/lemire/simple_fastfloat_benchmark

            I do not think that the string parsing has been made orders of magnitude faster. At least, my tests do not reveal much difference.