I more than agree that writing a good (micro)benchmark is hard. And not only in Java; in C++ as well. Benchmarking even the simplest function can become a hard problem when you run into code alignment issues or other CPU architectural effects like caches, etc. Take a look at my recent post: https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues
It is a cool post.
Do you have RSS/Atom feed for your blog?
Actually, with JMH I find it easier to write a micro-benchmark in Java. It solves most of the hard problems (within reason: if, for example, you allocate memory in your test you’ll always have significant variance), and puts the tools you need at hand to sink results, measure latency, measure throughput, etc. You can have the tool annotate the generated assembly for you with cycle counts, you can record performance counters, etc.
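For what it’s worth, here is a minimal sketch of what a JMH benchmark looks like (the class name, field, and array size are just my own illustration, not anything from the discussion above):

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)      // latency per call; use Mode.Throughput for ops per unit time
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@Fork(1)
public class SumBench {

    int[] data = new int[1024];

    @Setup
    public void fill() {
        for (int i = 0; i < data.length; i++) data[i] = i;
    }

    @Benchmark
    public void sum(Blackhole bh) {
        long s = 0;
        for (int v : data) s += v;
        bh.consume(s);                // sink the result so the JIT can’t dead-code the loop
    }
}

Running the built benchmark jar with -prof perfasm is what gives you the hot assembly annotated with cycle counts, and -prof perfnorm records hardware performance counters (both rely on Linux perf being available).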
The tool has definitely been written by experts and it shows.
Other than that, I agree – any kind of low-level performance analysis in these higher-level languages that run in a VM adds a whole layer of complicating factors, on top of the ones that the OS and runtime already impose.
That last point is important: some of these issues exist even for stuff written in C or C++. Transparent hugepage support, for example, is essentially a garbage collector running asynchronously in the OS, possibly changing your virtual-to-physical mappings at runtime, sometimes giving you huge pages and sometimes not, etc. You can see big run-to-run variation because of this. Of course, once you know about it, you can control it. This stuff is getting more common, not less.
I agree that JMH makes it easier to write benchmarks in Java. I think that all my Java benchmarks have relied on JMH for years now.