Daniel Lemire's blog


Memory-level parallelism : Intel Ice Lake versus Amazon Graviton 3

5 thoughts on “Memory-level parallelism : Intel Ice Lake versus Amazon Graviton 3”

  1. Markus Schaber says:

    Everything has a downside, though: sometimes the CPU does not yet know whether it needs to access certain memory (e.g., if the code is behind a conditional branch whose outcome depends on another memory access that is still in progress).
    Thus, the CPU speculatively reads the memory anyway, knowing the read may be wasted but gaining speed when it turns out to be needed.
    As the world is not ideal, those speculative reads have side effects (e.g., later reads are timed differently depending on whether the data is already in the cache). This has been exploited: search for the “Meltdown” and “Spectre” vulnerabilities.
    (The explanation here is simplified.)

  2. Joe Duarte says:

    Hi Daniel, the Graviton 3 uses DDR5 RAM, and the Ice Lake uses DDR4. How do you think that shapes these results?

    Interesting bit here: “Out in memory, Graviton 3 noticeably regress in latency compared to Ampere Altra and Graviton 2. That’s likely due to DDR5, which has worse latency characteristics than DDR4. Graviton 3 also places memory controllers on separate IO chiplets. That could exacerbate DDR5’s latency issues.”

    From: https://chipsandcheese.com/2022/05/29/graviton-3-first-impressions/

    I remember this paper on page sizes. I wonder what the impact would be on this kind of pointer chasing test:

    P. Weisberg and Y. Wiseman, “Using 4KB page size for Virtual Memory is obsolete,” 2009 IEEE International Conference on Information Reuse & Integration, 2009, pp. 262-265, doi: 10.1109/IRI.2009.5211562.

    1. My post has this comment: “I chose not to tweak the page size for these experiments.” We know that increasing the page size would improve matters. I chose not to play with it.

      I would agree that 4kB should be obsolete, but it is not up to me to change the default.

      Regarding DDR5… this might very well be a factor in the random-access bandwidth. It is likely that the Intel servers have mature DDR4 with low latency.

      1. Joe Duarte says:

        Yes, I was responding to your comment about 4 KB pages in my comment on the issue. I should’ve made that clear. Since it’s just a software setting, I’m not sure why we’d care about defaults. Lots of people run Linux servers with Large Pages or the giant or jumbo pages setting. I’m not sure if exact page sizes can be set in Linux, Windows Server, and FreeBSD, but those researchers found 16 KB to be the sweet spot.

        1. “I’m not sure why we’d care about defaults.”

          My expectation is that most people adopt the defaults, whatever their operating system is. So, yes, I care about the default for page sizes and I rarely change them myself.
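
To make the page-size discussion in this thread a little more concrete, here is a minimal sketch of the arithmetic involved: how many pages a working set spans at a given page size, and how much memory a fixed number of cached address translations can cover. The function names, the 1 GiB working set, and the 64-entry translation cache are hypothetical values chosen for illustration; they are not figures from the post, the comments, or either processor.

```python
# Illustrative arithmetic only. The working-set size and the number of
# cached translations are assumptions, not measurements from the post.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_spanned(working_set_bytes: int, page_bytes: int) -> int:
    """Number of pages needed to cover the working set, rounded up."""
    return -(-working_set_bytes // page_bytes)


def translation_reach(entries: int, page_bytes: int) -> int:
    """Bytes covered by `entries` cached address translations."""
    return entries * page_bytes


if __name__ == "__main__":
    working_set = 1 * GIB  # hypothetical array size
    entries = 64           # hypothetical number of cached translations
    page_sizes = (("4 KiB", 4 * KIB), ("16 KiB", 16 * KIB), ("2 MiB", 2 * MIB))
    for label, page in page_sizes:
        pages = pages_spanned(working_set, page)
        reach_kib = translation_reach(entries, page) // KIB
        print(f"{label} pages: {pages} pages, reach {reach_kib} KiB")
```

Under these assumptions, larger pages shrink the number of distinct pages a random-access test can touch, which is one plausible reason why the page size may matter for a pointer-chasing benchmark. The sketch says nothing about actual latencies on either system.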