20th September 2019, 4 min read

How far can you scale interleaved binary searches?

7 thoughts on “How far can you scale interleaved binary searches?”

Wes Felter says:

September 20, 2019 at 10:00 pm

Do you really mean page faults or TLB misses? You might also validate this by comparing the TLB reach of the processor to the data size.
1. Daniel Lemire says:
  
  September 20, 2019 at 10:25 pm
  
  Do you really mean page faults or TLB misses?
  
  I meant TLB misses, thanks. I tried to avoid introducing the concept of the TLB in the post, hence the confusion.
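
The check suggested above, comparing TLB reach to the data size, comes down to simple arithmetic: reach is roughly the number of TLB entries times the page size. Here is a minimal sketch of that comparison. The entry count and page sizes are illustrative assumptions, not figures taken from the post or measured on any machine, and the function names are hypothetical.

```python
def tlb_reach_bytes(entries: int, page_size_bytes: int) -> int:
    """Approximate memory covered by the TLB: entries * page size."""
    return entries * page_size_bytes


def exceeds_tlb_reach(data_size_bytes: int, entries: int, page_size_bytes: int) -> bool:
    """True when the data set is larger than what the TLB can map at once."""
    return data_size_bytes > tlb_reach_bytes(entries, page_size_bytes)


# Assumed, illustrative values (check your own processor's documentation).
ENTRIES = 1536                 # hypothetical second-level TLB entry count
SMALL_PAGE = 4 * 1024          # 4 KiB pages
HUGE_PAGE = 2 * 1024 * 1024    # 2 MiB pages
DATA_SIZE = 64 * 1024 * 1024   # hypothetical 64 MiB array being searched

if __name__ == "__main__":
    for name, page in (("4 KiB", SMALL_PAGE), ("2 MiB", HUGE_PAGE)):
        reach = tlb_reach_bytes(ENTRIES, page)
        over = exceeds_tlb_reach(DATA_SIZE, ENTRIES, page)
        print(f"{name} pages: reach = {reach // (1024 * 1024)} MiB, "
              f"data exceeds reach: {over}")
```

If the data size is well beyond the computed reach, random accesses such as binary search probes are likely to miss in the TLB; larger pages raise the reach without changing the entry count.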
Peter F. says:

September 22, 2019 at 6:52 pm

Isn’t the limit due to memory-level parallelism, which you wrote about earlier?

https://lemire.me/blog/2018/11/13/memory-level-parallelism-intel-skylake-versus-apple-a12-a12x/
1. Daniel Lemire says:
  
  September 22, 2019 at 7:59 pm
  
  But we are still a way off from the level of parallelism we should have.
2. Oren Tirosh says:
  
  October 4, 2019 at 7:40 am
  
  What about core-level or hyperthread-level parallelism? Does it share the same resources and limits? Some of them?
  1. Daniel Lemire says:
    
    October 4, 2019 at 7:50 am
    
    You can have more than one thread per core to make better use of your memory resources, but you are going to have to pay the overhead of threads…
Oren Tirosh says:

October 4, 2019 at 8:08 am

Cores or hyperthreads can be used to learn the limits: to know whether further optimization is theoretically possible, and to predict whether this technique might boost the overall performance of a parallel application or would only squeeze more out of single-core benchmarks and unique scenarios.