7th February 2023, 7 min read

Bit Hacking (with Go code)

9 thoughts on “Bit Hacking (with Go code)”

Marcin Zukowski says:

February 8, 2023 at 10:04 pm

Interesting article, skimmed through it, lots of known techniques that are worth publicizing more!

The discussion of the knowledge of the internal representation of values made me think back to something I stumbled on recently in one Intel library.
It was a way to compute leading binary zeroes for architectures that don’t have CLZ (I assume). Roughly this (simplified):

typedef union { int64_t ui64; double dbl; } doubleint;
int clz(int64_t x) { doubleint tmp; if (x >= 0x0020000000000000ull) { // x >= 2^53 // split the 64-bit value in two 32-bit halves to avoid rounding errors tmp.dbl = (double) (x >> 32); // exact conversion return 31 - ((((unsigned int) (tmp.ui64 >> 52)) & 0x7ff) - 0x3ff); } else { // if x < 2^53 tmp.dbl = (double) x; // exact conversion return 63 - ((((unsigned int) (tmp.ui64 >> 52)) & 0x7ff) - 0x3ff); } }

What happens here is that the author exploits the fact that double normalizes the mantisa, and then extracts the number of leading zeroes from the double's exponent.

Fast? Yes, if we don’t have CLZ. So no 🙂

Beautiful? For me, yes! 🙂
Daniel Lemire says:

February 8, 2023 at 10:29 pm

lots of known techniques that are worth publicizing more!

I think you can spend all your life being a very productive programmer without knowing any of this well. But I think you should know that it exists if you want to do serious programming.
Nevin says:

February 9, 2023 at 12:22 am

the number 1<<12 has 64-12=52 leading zeros followed by a 1 with 11 trailing zeros.

This is off-by-one.
1. Daniel Lemire says:
  
  February 9, 2023 at 4:08 am
  
  Much appreciated.
Stefan Kanthak says:

February 9, 2023 at 7:07 am

“Once we have determined that we have a value that might correspond to a surrogate pair, we may check that the first value x1 is valid (in the range 0xd800 to 0xdbff) with the condition (x-0xd800)<=0x3ff, and similarly for the second value x2: (x-0xdc00)<=0x3ff. ”

An initial (separate) test (x-0xd800) <=x7ff) of high and low code unit is superfluous; ((x-0xd800)<=0x3ff) && ((x-0xdc00)<=0x3ff) does the job.
What you but failed to mention is: the latter can be simplified to ((x-0xd800)|(x-0xdc00))<=0x3ff eventually saving the conditional branch for &&
Stefan Kanthal says:

February 9, 2023 at 7:42 am

“We may then reconstruct the code point as ((x-0xd800)<<10) + x-0xdc00.”

Nope! The code point is (1<<20) + ((x-0xd800)<<10) + x-0xdc00 or the equivalent (1<<20) | ((x-0xd800)<<10) | (x-0xdc00).
1. Daniel Lemire says:
  
  February 9, 2023 at 1:47 pm
  
  Thanks for catching the mistake !
BUGFIX-66 says:

February 10, 2023 at 3:15 pm

A variety of interesting and useful bitwise trick and techniques from Hacker’s Delight and The Art of Computer Programming (section 7.1.3) are available as Go programming puzzles here: https://BUGFIX-66.com (Enjoy!)
shogg says:

February 21, 2023 at 9:39 pm

A famous source for bit hacks is: https://graphics.stanford.edu/~seander/bithacks.html

I applied one of them here in a programming challenge: https://github.com/plutov/practice-go/blob/fa1e3e4b874f6ce7844175ce5fd6546dcd48c544/brokennode/brokennode.go#L86

Great success.