How fast is getline in C++?
A standard way to read a text file in C++ is to call the getline function. To iterate over all lines in a file and sum up their lengths, you might do as follows:
    while (getline(is, line)) {
        x += line.size();
    }
How fast is this?
On a Skylake processor with a recent GNU GCC compiler without any disk access, it runs at about 2 GB/s. That’s slower than the maximal throughput of a good flash drive. These results suggest that reading a text file in C++ could be CPU bound in the sense that buying an even faster disk would not speed up your single-threaded throughput.
| metric | getline |
|---|---|
| time per byte | 0.46 ns |
| speed | 2.0 GB/s |
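To get a feel for how such a number can be measured, here is a minimal sketch. It is not the benchmark behind the figures above: it assumes the text already sits in memory in a `std::string`, reads it through a `std::istringstream`, and times the loop with `std::chrono::steady_clock`. The helper names (`make_text`, `sum_line_lengths`, `getline_gigabytes_per_second`) are made up for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Builds an in-memory text of `lines` lines, each `width` characters plus '\n'.
// (Hypothetical helper, only here to have something to read.)
std::string make_text(std::size_t lines, std::size_t width) {
    std::string text;
    text.reserve(lines * (width + 1));
    for (std::size_t i = 0; i < lines; i++) {
        text.append(width, 'a');
        text.push_back('\n');
    }
    return text;
}

// The loop from the post: sum the lengths of all lines in the stream.
std::size_t sum_line_lengths(std::istream &is) {
    std::string line;
    std::size_t x = 0;
    while (std::getline(is, line)) {
        x += line.size();
    }
    return x;
}

// Times one pass over `text` and returns the throughput in GB/s.
// The sum is written to `sum_out` so the compiler cannot discard the loop.
double getline_gigabytes_per_second(const std::string &text, std::size_t &sum_out) {
    std::istringstream is(text); // copy made before the timer starts
    auto start = std::chrono::steady_clock::now();
    sum_out = sum_line_lengths(is);
    auto stop = std::chrono::steady_clock::now();
    double ns = std::chrono::duration<double, std::nano>(stop - start).count();
    return static_cast<double>(text.size()) / ns; // bytes per ns equals GB/s
}
```

Calling `getline_gigabytes_per_second(make_text(1000000, 80), sum)` would time a pass over about 81 MB of text. A single run is noisy, so you would want to repeat it several times, and the result will depend on your compiler, flags and hardware.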
If you write code that processes the strings generated by the getline function calls, in the worst case, the total time will be the sum of the time required by the getline function plus the time required by your code. In other words, you are unlikely to achieve speeds near 2 GB/s.
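To make the worst case concrete: when two stages run one after the other, their times per byte add up, so the combined throughput is the reciprocal of the sum of the reciprocals. A small sketch, with a hypothetical function name:

```cpp
// Worst-case combined throughput (in GB/s) of two stages that run
// one after the other over the same bytes, with no overlap.
// Time per byte is 1/throughput, and the times add up.
double combined_gbps(double first_gbps, double second_gbps) {
    return 1.0 / (1.0 / first_gbps + 1.0 / second_gbps);
}
```

For instance, if your own processing also ran at 2 GB/s in isolation, `combined_gbps(2.0, 2.0)` gives 1 GB/s, half the speed of getline alone.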
In comparison, a software library like simdjson can parse and validate JSON inputs, doing everything from Unicode validation to number parsing, at a speed of over 2 GB/s.
I have not tried to do so, but it should be possible to locate lines and iterate over them at speeds much greater than 2 GB/s.
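One way this could look, as an untested idea rather than a measured result: keep the whole input in memory, find each newline with `std::memchr`, and hand out `std::string_view` slices instead of copying every line into a `std::string`. The sketch below assumes C++17 and '\n' line endings; the function name is made up, and I make no claim about how fast it runs.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Sums the lengths of all lines in `text` without copying any of them.
// Lines are delimited by '\n'; a final line without '\n' still counts,
// which mirrors what the getline loop does.
std::size_t sum_line_lengths_memchr(std::string_view text) {
    std::size_t x = 0;
    const char *p = text.data();
    const char *end = p + text.size();
    while (p < end) {
        // Look for the next newline in the remaining bytes.
        const void *hit = std::memchr(p, '\n', static_cast<std::size_t>(end - p));
        const char *nl = hit ? static_cast<const char *>(hit) : end;
        // A view over the current line; no allocation, no copy.
        std::string_view line(p, static_cast<std::size_t>(nl - p));
        x += line.size();
        // Skip past the newline, or stop if this was the last line.
        p = (nl == end) ? end : nl + 1;
    }
    return x;
}
```

The views are only valid as long as the underlying buffer is alive, which is the price of avoiding the per-line copy.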