25th April 2017, 5 min read

Quickly pruning elements in SIMD vectors using the simdprune library

9 thoughts on “Quickly pruning elements in SIMD vectors using the simdprune library”

Gianluca Della Vedova says:

April 26, 2017 at 9:02 am

In the example of the README of your project, shouldn’t the zero vector 0,0,0,0,0,0,0,0 be a one vector 1,…,1?
1. Daniel Lemire says:
  
  April 26, 2017 at 3:10 pm
  
  Yes. Fixed. Thank you.
KWillets says:

April 26, 2017 at 5:41 pm

There was a thread on Stack Overflow about packing left from a mask, and they recommended using PEXT to pull the bits. Would that work here?
1. Daniel Lemire says:
  
  April 26, 2017 at 5:53 pm
  
  Yes. It can be made to work. It might be very useful for pruning bytes because the current solution, with a large table, is not ideal.
  
  How to make it all come together for high efficiency is the tricky part.
  1. KWillets says:
    
    April 26, 2017 at 9:00 pm
    
    Here’s the thread: http://stackoverflow.com/questions/36932240/avx2-what-is-the-most-efficient-way-to-pack-left-based-on-a-mask
    
    They also mention VCOMPRESSPS for 32-bit values under AVX512.
    1. Daniel Lemire says:
      
      April 26, 2017 at 9:06 pm
      
      I am aware of vcompress and it is mentioned in the README of the library. It is not super useful because none of us has access to it.
      
      The BMI code is cool.
      1. Daniel Lemire says:
        
        April 26, 2017 at 10:55 pm
        
        I have added, for benchmarking purposes, the BMI approach and, in my tests, it is slower. The BMI instructions can be nice, but they often have high latency, so if you chain them through data dependencies, the result is not always very efficient.
        
        KWillets says:
        
        May 2, 2017 at 9:39 pm
        
        The Ryzen instruction tables just came out on Agner’s site, and PEXT/PDEP have reciprocal latency of 18 cycles. 🙁
        
        Daniel Lemire says:
        
        May 4, 2017 at 4:07 pm
        
        @KWillets
        
        That sounds bad. On the other hand, I am not sure that PEXT/PDEP are common in software, or even that they will become common.
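
For readers who have not followed the Stack Overflow link, here is a rough, portable sketch of the PEXT idea discussed in the thread above. It is not the simdprune code and not the benchmarked BMI variant. The function names (`pext64_soft`, `expand_mask`, `pack_left8`) are made up for illustration, the loops stand in for the real BMI2 instructions (exposed in C as `_pext_u64` and `_pdep_u64`), and the sketch assumes that a set mask bit means "keep this lane", which may be the opposite of the library's own convention.

```c
#include <stdint.h>

/* Portable stand-in for BMI2 PEXT: gather the bits of src that are
   selected by mask and pack them into the low bits of the result. */
static uint64_t pext64_soft(uint64_t src, uint64_t mask) {
    uint64_t out = 0;
    int k = 0; /* next free bit position in the output */
    for (int i = 0; i < 64; i++) {
        if ((mask >> i) & 1) {
            out |= ((src >> i) & 1) << k;
            k++;
        }
    }
    return out;
}

/* Widen each of the 8 mask bits into a whole byte (0x00 or 0xFF).
   With BMI2 this step is typically a PDEP followed by a multiply. */
static uint64_t expand_mask(uint8_t keep) {
    uint64_t wide = 0;
    for (int i = 0; i < 8; i++) {
        if ((keep >> i) & 1) {
            wide |= (uint64_t)0xFF << (8 * i);
        }
    }
    return wide;
}

/* Move the kept lanes of in[0..7] to the front of out, zero the rest,
   and return how many lanes were kept. Assumption: bit i of keep set
   means lane i survives. */
static int pack_left8(const uint32_t in[8], uint8_t keep, uint32_t out[8]) {
    /* One lane index per byte: 0, 1, ..., 7 from the low byte up. */
    const uint64_t identity = 0x0706050403020100ULL;
    /* Extracting the selected bytes yields the surviving lane indices,
       already packed to the left. */
    uint64_t packed = pext64_soft(identity, expand_mask(keep));
    int count = 0;
    for (int i = 0; i < 8; i++) {
        count += (keep >> i) & 1;
    }
    for (int i = 0; i < count; i++) {
        out[i] = in[(packed >> (8 * i)) & 0xFF];
    }
    for (int i = count; i < 8; i++) {
        out[i] = 0;
    }
    return count;
}
```

In an AVX2 version, the packed byte indices would be widened to 32-bit lanes and passed to a permute such as `_mm256_permutevar8x32_epi32` instead of the scalar copy loop. The index computation is a chain of dependent steps (expand the mask, extract, widen, permute), which is where the latency concern raised above comes in.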