Daniel Lemire's blog

, 2 min read

The dangers of AVX-512 throttling: a 3% impact on Xeon Gold processors?

Intel’s latest processors come with powerful new instructions from the AVX-512 family. These instructions operate over 512-bit registers. They use more power than regular (64-bit) instructions. Thus, on some Intel processors, the processor core that is using AVX-512 might run at a lower frequency, to keep the processor from overheating or using too much power.

Can we measure this effect?

In a recent post, I used a benchmark provided by Vlad Krasnov from Cloudfare, on a Xeon Gold 5120 processor. In the test provided by Krasnov, the use of AVX-512 actually made things faster.

So I just went back to an earlier benchmark I designed myself. It is a CPU-intensive Mandelbrot computation, with very few bogus AVX-512 instructions thrown in (about 32,000). The idea is that if AVX-512 cause frequency throttling, the whole computation will be slowed. I use two types of AVX-512 instructions: light (additions) and heavy (multiplications).

I measured the effect of AVX-512 throttling on the Skylake X server I own… but what about the Xeon Gold 5120 processor?

I run the benchmark ten times and measure the wall-clock time using the Linux/bash time command. I sleep 2 seconds after each sequence of ten tests. A complete script is provided. In practice, I just run the benchmark.sh script after typing make and I record the user timings of each test.

Here are my raw numbers from one run (there are run-run variations in the 1-2% range):

No AVX-512 Light AVX-512 Heavy AVX-512
9.43 s 9.84 s 9.78 s

Thus AVX-512 incurs a 3-4% penalty. I can’t measure a difference between light and heavy AVX-512 instructions.

Is that a lot? It is hard for me to get terribly depressed at the fact that a benchmark I specifically designed to make AVX-512 look bad sees a 3% performance degradation on one core. Real code is not going to use AVX-512 in such a manner: the AVX-512 instructions will do useful work. It is not super difficult to recoup a 3% difference.

So single-core and sporadic usage of AVX-512 instructions looks to be harmless. You have to use multiple cores to get in real trouble.

Further reading: AVX-512: when and how to use these new instructions