You might want to retry using a StringBuilder rather than a mutex-locked StringBuffer. This should get you a serious performance increase.
I am not benchmarking anything having to do with a StringBuffer.
BufferedReader internally uses a StringBuffer, which from my reading can be safely swapped out for a StringBuilder.
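As a rough illustration of what the swap buys (a hypothetical micro-benchmark, not the actual OpenJDK patch, and timings will vary with JIT lock elision):

    // Hypothetical micro-benchmark: StringBuffer synchronizes every call,
    // while StringBuilder offers the same API without locking.
    public final class AppendBench {
        public static void main(String[] args) {
            final int n = 10_000_000;

            long t0 = System.nanoTime();
            StringBuffer buf = new StringBuffer();
            for (int i = 0; i < n; i++) buf.append('x'); // synchronized on every call
            long t1 = System.nanoTime();

            StringBuilder bld = new StringBuilder();
            for (int i = 0; i < n; i++) bld.append('x'); // same API, no locking
            long t2 = System.nanoTime();

            // Use the results so the loops cannot be optimized away.
            System.out.println(buf.length() + bld.length());
            System.out.printf("StringBuffer:  %d ms%n", (t1 - t0) / 1_000_000);
            System.out.printf("StringBuilder: %d ms%n", (t2 - t1) / 1_000_000);
        }
    }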
I seem to get an improvement in performance from using a patched BufferedReader with that change.
Code to be published later:
Benchmark                                 Mode  Cnt   Score   Error  Units
MyBenchmark.stdLibBufferedReader         thrpt   25  29.542 ± 0.599  ops/s
MyBenchmark.patchedStdLibBufferedReader  thrpt   25  33.426 ± 0.108  ops/s
MyBenchmark.stringLines                  thrpt   25  87.141 ± 1.155  ops/s
I’ll check the OpenJDK project to see whether that’s a reasonable change.
It’s an overdue change.
Let me know if assistance is needed: the Java NIO package uses BufferedReader as well, and the mutex locks in StringBuffer cause nasty and unnecessary performance hits.
Published my code.
Still waiting for the thing I reported on https://bugs.java.com/ to be reviewed and published.
BufferedReader is probably the most common way of reading files, but it is also the slowest, as shown by Martin Thompson: https://mechanical-sympathy.blogspot.com/2011/12/java-sequential-io-performance.html
It is certainly not the slowest! Using Scanner over a raw file is far slower.
That’s fair. Should have said one of the slowest.
“the default buffer size is 8192 characters capacity. Line size is considered as 80 chars capacity.”
from the JavaDoc:
http://www.docjar.com/html/api/java/io/BufferedReader.java.html
Maybe you could resize it.
Good catch. I bet that would speed it up.
The ideal buffer size and line size aren’t really related. The buffer size is for reading large(ish) chunks of the file which are then parsed into lines. Selecting the buffer size is mostly a function of system call overhead versus the desire to keep stuff in L1. I have repeatedly found 8K to be a sweet spot, although that was pre-Meltdown/Spectre which might have pushed the ideal buffer size up.
You should try resizing the buffer to different sizes and rerunning the benchmark. There is no magic behind BufferedReader; everything is synchronous, and it refills with data once the buffer is emptied.
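For instance, a quick sketch of such a sweep (the path is a placeholder; the second constructor argument is the buffer size in chars):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public final class BufferSizeSweep {
        public static void main(String[] args) throws IOException {
            String path = args.length > 0 ? args[0] : "data.txt"; // hypothetical input file
            for (int size : new int[] {4 << 10, 8 << 10, 64 << 10, 1 << 20}) {
                long start = System.nanoTime();
                long lines = 0;
                try (BufferedReader br = new BufferedReader(new FileReader(path), size)) {
                    while (br.readLine() != null) {
                        lines++;
                    }
                }
                double ms = (System.nanoTime() - start) / 1e6;
                System.out.printf("buffer %8d chars: %d lines in %.1f ms%n", size, lines, ms);
            }
        }
    }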
Using String.lines in JDK 11 gives me 2 or 3 times better performance.
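If I understand the approach correctly, it looks something like this (a sketch assuming JDK 11+; the path is a placeholder):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public final class StringLinesDemo {
        public static void main(String[] args) throws IOException {
            // Read the whole file into one String, then stream its lines.
            String content = Files.readString(Paths.get("data.txt")); // hypothetical path
            long count = content.lines().count(); // lines() splits on \n, \r or \r\n
            System.out.println(count + " lines");
        }
    }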
What’s the point of this post? Java is not easier to write [than C or C++], probably harder, one needs [to have installed] a VM and it’s slow, so what’s to like?
Java is much easier to write than C or C++ (for one, you don’t have to manually manage memory, and for another, it has far fewer corner cases and nuances than C++), it’s plenty fast for most tasks (and on par with C/C++ on some), and installing a VM is a non-issue.
So your comment is wrong in each and every statement it makes…
Using RAII and smart pointers does away with manual memory management; forget “C with classes”, we moved on from there.
Yes, it’s subtle and one needs to master it, but that does not in itself mean it’s hard to write [and it has become simpler to write fast code since C++11 and the following standard updates].
I don’t need to ask my user to install the JVM, that seems like a major advantage.
You forgot the most important bit: it is 2 times slower [with the optimizations in some of the other answers; 4 times slower as posted] than plain C++. That’s the difference between Google needing a mere 500,000 servers [to conduct its business] and 1,000,000.
What is the point of your comment? I get you don’t like Java, but it’s hardly relevant to the speed of I/O.
Folks like you have been providing about half the work that I get, so THANKS! Java WAS slow, in the ’90s and early 2000s. The JIT compiler around 2000 and the runtime profilers and optimizers introduced around 2005 made Java quite performant. And there are lots of “extras” that the VM provides if you are doing anything that needs databases, XML, web services, or any other non-trivial application.
Ever tried Files.readAllLines?
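For reference, a minimal sketch (placeholder path); note that it materializes every line in memory at once, unlike a streaming read:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;

    public final class ReadAllLinesDemo {
        public static void main(String[] args) throws IOException {
            List<String> lines = Files.readAllLines(Paths.get("data.txt")); // hypothetical path
            System.out.println(lines.size() + " lines");
        }
    }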
In contrast to the C getline function, BufferedReader.readLine and BufferedReader.lines do not include the newline character in the returned strings. It looks like you are building a huge one-line string in scanFile, which would lead to repeated resizing of the read buffer later on.
I played around with the code some more and the above suggestion does not really improve the performance as much as I thought. There is still too much copying of string contents happening.
Using the indexOf/substring loop mentioned on Hacker News gets the performance to about 2x the original, but substring is still creating copies. (This actually changed in Java 7; earlier it would create a view holding on to the original string contents, which was deemed bad for memory consumption.)
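The rough shape of that loop, assuming the whole file is already in content and parseLine is the post’s per-line routine:

    // Walk the file contents line by line without a BufferedReader.
    static void scanLines(String content) {
        int start = 0;
        int end;
        while ((end = content.indexOf('\n', start)) != -1) {
            parseLine(content.substring(start, end)); // substring copies since Java 7
            start = end + 1;
        }
        if (start < content.length()) {
            parseLine(content.substring(start)); // trailing text without a final newline
        }
    }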
Using subSequence and changing parseLine to accept a CharSequence sounds like it should work, but it behaves exactly the same as substring: for backwards compatibility, the subSequence method just delegates to substring.
The one thing that gave a huge improvement was to implement a custom CharSequence that does no copying and to create instances of it in an indexOf loop. With that approach I finally got to about 2 GB/s on this Haswell laptop.
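One possible shape of such a zero-copy view (a sketch, not the exact code from the gist below):

    // A window into an existing String; nothing is copied until toString() is called.
    // Bounds checks omitted for brevity.
    final class SubSeq implements CharSequence {
        private final String base;
        private final int from, to;

        SubSeq(String base, int from, int to) {
            this.base = base;
            this.from = from;
            this.to = to;
        }

        @Override public int length() { return to - from; }
        @Override public char charAt(int index) { return base.charAt(from + index); }
        @Override public CharSequence subSequence(int start, int end) {
            return new SubSeq(base, from + start, from + end);
        }
        @Override public String toString() { return base.substring(from, to); }
    }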
So I completely agree with your point: Java can be fast, but you’d have to know exactly what you’re doing. And often the standard library works against you.
Modified code is available at https://gist.github.com/jhorstmann/9dcdc3c26a26e4ad6f513128942a47d9
Thanks for sharing your code.
Further comments on this post at https://news.ycombinator.com/item?id=20542023 include a potential optimization to the benchmark which might reduce the slowdown from 4x to just twice as slow.
I’m Java illiterate, but there’s a comment on HN (https://news.ycombinator.com/item?id=20542438) that suggests you might not be measuring what you think you are measuring. Specifically, the author says that the call to lines() in your preprocessing step (L19) strips all the newlines, so that when you concatenate the results together with append() you are creating a single 23MB “line”. I’m not sure if it affects your conclusion, but given that your benchmarking is over a foreach loop, presumably this wasn’t your intent?
It was also suggested (I think usefully) that a few more details about the test environment would be helpful to evaluate your result. While you mention it in the linked earlier post, it would probably help to say again which machine, which version of Java, which C++ compiler, and so forth so that the post is more standalone.
There was a typo in an earlier version of my code, but this was quickly corrected. It turns out not to affect the result… or, at least, not the conclusion.
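For readers following along: lines() drops the delimiters, so preprocessing that should preserve line structure needs to re-append them. A sketch of that idea (placeholder path; not necessarily the exact fix that was applied):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    public final class PreserveNewlines {
        public static void main(String[] args) throws IOException {
            StringBuilder sb = new StringBuilder();
            try (Stream<String> lines = Files.lines(Paths.get("data.txt"))) { // hypothetical path
                lines.forEach(line -> sb.append(line).append('\n')); // re-append the stripped newline
            }
            System.out.println(sb.length() + " chars, newlines preserved");
        }
    }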
I’ll add more details, but I think this is somewhat nitpicking unless one can show that they consistently get 3 GB/s parsing text files in Java. That is, I provide an example that I view as ‘representative’ or ‘credible’.
Related…
http://bannister.us/weblog/2008/why-fileinputstream-is-slow
Thanks for sharing. Note that we have much faster disks today than in 2008.
Have you tried?
String s = null;
while ((s = br.readLine()) != null) {
    parseLine(s);
}
I wonder how much overhead there is in the streams.
Yes… see the code repository.
It seems that streams have some overhead, maybe… but it is small.
Please note that BufferedReader.lines is smarter than getline: it supports any line delimiter (‘\n’, ‘\r’ or ‘\r\n’), while getline supports only ‘\n’. Clearly, having more tests per character adds some overhead. Still, I would rather pay this overhead than get a garbage result if the input file comes from Windows. As was mentioned above, use String.split(‘\n’) if you specifically need ‘\n’.
Oh, sorry, String.split(‘\n’) was not mentioned above, and probably it’s not the best solution as it would allocate all the strings at once.
FYI, a small improvement was done in OpenJDK as a result of this blog post:
https://bugs.openjdk.java.net/browse/JDK-8229022
Ismael
Thanks for the pointer.