Daniel Lemire's blog


Sorting 1 terabyte in 209 seconds

2 thoughts on “Sorting 1 terabyte in 209 seconds”

  1. Kevembuangga says:

    The Terabyte sort seems pretty silly. Of course throwing a shitload of resources at a problem is bound to give “impressive results”, but where is the benefit for the average user?
    i.e., your 6000-second sort.
    This looks like Formula 1 racing, which is supposed to further technological progress and once in a while does, but at what cost?
    The Penny sort on the same page seems more sensible.
    BTW, from experience with my linear sort, the 6000 seconds you report for 2 GB falls within a plausible range of elapsed time due to disk access latency when sorted records are shuffled around, not a compute-bound limit; you might check it.

  2. The 6000 seconds is definitely not “internal memory”, since the whole machine has 2 GiB of RAM and it tries to sort 2 GiB of data. So there is quite a bit of I/O overhead. Sure.