Daniel Lemire's blog


ARM vs Intel on Amazon's cloud: A URL Parsing Benchmark

Twitter user opdroid1234 remarked that they are getting more performance out of the ARM nodes than out of the Intel nodes on Amazon’s cloud (AWS).

I previously found that the Graviton 3 processors have less bandwidth than comparable Intel systems. However, I had not done much testing of raw compute power.

The Intel processors have the crazily good AVX-512 instructions; ARM processors have nothing close except for dedicated accelerators. But what about more boring computing?

We wrote a fast URL parser in C++. It relies on nothing beyond portable C++: no assembly language, no explicit SIMD instructions, and so forth.
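To give a sense of what the parser does, here is a minimal sketch of how the library might be called; the exact API may differ from the current Ada release, so treat the function and type names below as assumptions rather than a reference.

    #include <iostream>
    #include "ada.h" // the Ada URL parser from https://github.com/ada-url/ada

    int main() {
      // Parse a URL string; the parser returns a result object that must be checked for validity.
      auto url = ada::parse<ada::url_aggregator>("https://user@example.com:8080/path?query=1#frag");
      if (!url) {
        std::cerr << "invalid URL" << std::endl;
        return 1;
      }
      // The parsed components are exposed through getters.
      std::cout << url->get_protocol() << std::endl; // scheme, e.g. "https:"
      std::cout << url->get_hostname() << std::endl; // host, e.g. "example.com"
      std::cout << url->get_pathname() << std::endl; // path, e.g. "/path"
      return 0;
    }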

Can the ARM processors parse URLs faster?

I am going to compare the following node types:

  • c6i.large: Intel Ice Lake (0.085 US$/hour)
  • c7g.large: Amazon Graviton 3 (0.0725 US$/hour)

I am using Ubuntu 22.04 on both nodes. I make sure that cmake, ICU and GNU G++ are installed.
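On Ubuntu 22.04, something along the lines of sudo apt-get install -y git cmake g++ libicu-dev should pull in the required packages; the exact package names may need adjusting.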

I run the following routine:

  • git clone https://github.com/ada-url/ada
  • cd ada
  • cmake -B build -D ADA_BENCHMARKS=ON
  • cmake --build build
  • ./build/benchmarks/bench --benchmark_filter=Ada
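The last command assumes that the benchmarks are built on top of Google Benchmark: --benchmark_filter=Ada is the standard Google Benchmark option that restricts the run to benchmarks whose names match "Ada".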

The results are that the ARM processor is indeed slightly faster:

Intel Ice Lake 364 ns/url
Graviton 3 320 ns/url

The Graviton 3 processor is about 14% faster (364/320 ≈ 1.14). It is not the 20% to 30% that opdroid1234 reports, but the Graviton 3 nodes are also about 15% cheaper per hour.
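As a back-of-the-envelope check using the hourly prices above, the cost per parsed URL works out to roughly (0.085 × 364) / (0.0725 × 320) ≈ 1.33 times higher on the Intel node, so the Graviton 3 node delivers roughly 33% more parsed URLs per dollar on this particular benchmark.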

Please note that (1) I am presenting just one data point and I encourage you to run your own benchmarks, (2) I am sure that opdroid1234 is being entirely truthful, (3) I love all processors (Intel, ARM) equally, and (4) I am not claiming that ARM is better than Intel or AMD.

Note: I do not own stock in ARM, Intel or Amazon. I do not work for any of these companies.

Further reading: Optimized PyTorch 2.0 inference with AWS Graviton processors