Daniel Lemire's blog

, 2 min read

Data structure size and cache-line accesses

3 thoughts on “Data structure size and cache-line accesses”

  1. dirtysalt says:

    It’s quite bit counter-intuitive.

    Take size=2 for example, the offset could be any random value. If address span is [63, 64], then we touch 2 cache lines. So size = 2 can not be 1.0.

    I write a python script to compute expected cache line access

    def expected_access(size):
    C = 64 # Cache line size
    ss = []
    for off in range(0, C): # offset could be random value.
    a = off // C
    b = (off + size – 1) // C
    ss.append(b – a + 1) # access b-a+1 cache lines
    return sum(ss) / len(ss) # compute average

    and 2 bytes expected value is 1.015625, 32 bytes expected value is 1.484375.

    1. If you have a 2-byte value, and you lay it out in a packed array as described in the blog post, you will never have a 2-byte value covering two cache lines.

      I think that you are considering a different model where you put your values anywhere in memory (at a random address).

      Then I suspect that the probability of an overlap is min(1-(64-x+1)/64,1).

      1. dirtysalt says:

        Thanks, I misunderstand your model. I’ll try it later.