There are some downsides to the bitmap approach: you first have to construct the bitmaps and then you have to extract the set bits.
If you do a bunch of intersections, you end up with few non-zero bits/bytes. Hence, extraction may be quite cheap.
PS: Is JavaEWAH a sparse bitmap?
@Itman
1) Bitmap decoding should be cheap compared to the cost of computing an intersection, but it is still a significant overhead. Thankfully, you can reduce this overhead with fast bitmap decoding algorithms (see http://lemire.me/blog/archives/2012/05/21/fast-bitmap-decoding/).
2) Yes, JavaEWAH supports sparse bitmaps. There is also a C++ counterpart (https://github.com/lemire/EWAHBoolArray) but JavaEWAH is more popular.
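To make the decoding step concrete, here is a minimal Java sketch of the usual extraction loop over plain 64-bit words (names are illustrative; this is not the JavaEWAH API, which adds run-length compression on top of the same word-level operations):

    // Sketch: AND two uncompressed bitmaps, then decode the positions of the set bits.
    final class BitmapDecode {
        // AND word by word (assumes both bitmaps have the same length).
        static long[] and(long[] a, long[] b) {
            long[] out = new long[a.length];
            for (int i = 0; i < a.length; i++) {
                out[i] = a[i] & b[i];
            }
            return out;
        }

        // Extract set bits by repeatedly clearing the lowest one.
        static java.util.List<Integer> positions(long[] words) {
            java.util.List<Integer> result = new java.util.ArrayList<>();
            for (int i = 0; i < words.length; i++) {
                long w = words[i];
                while (w != 0) {
                    result.add(i * 64 + Long.numberOfTrailingZeros(w));
                    w &= w - 1; // clear the lowest set bit
                }
            }
            return result;
        }
    }

After several bitmaps have been AND-ed together, most words are zero and the inner loop rarely runs, which is why extraction can end up cheap.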
A good question to ask is how few bits we have to examine to construct the intersection of 2 or more vectors, given any choice of encoding for the vectors (without pre-storing intersections, etc.).
I looked at this problem some years ago and decided that (delta and) Golomb-Rice coding could be rearranged into a sort of bucketed bitmap arrangement, where the range is broken into buckets of M consecutive ints, represented in a bitmap (corresponding to the unary parts of the coding), and an array of offsets within each occupied bucket, encoded in truncated binary as in the original.
For example, a sorted vector of ints in the range 0-127, with a Golomb-Rice bucket size of 8, would be broken into a 16-bit bitmap (buckets 0-7, 8-15, …, 120-127), and an array of offsets r < 8 within those buckets, in truncated binary (since a bucket can contain more than one value, the offsets also need a bit to indicate "consecutive in the same bucket" or some such).
Calculating the intersection is a two-step process of first and-ing the bitmaps, and then fetching and comparing offsets in any bucket where a 1 remains. Since the offsets may be slow to fetch, the idea is to use the bitmaps as much as possible.
The optimality of Golomb-Rice means the average bitmap will be about 50% sparse, and n-way intersections can get 2^-n sparsity before any bucket contents need to be examined.
I never decided on the best way to store the offsets, but this seemed like a good way to size the summary bitmaps.
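A rough Java sketch of that two-step intersection, with illustrative names, using fixed-width (Rice-style) offsets and a plain per-bucket array rather than a packed truncated-binary stream:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch: values in [0, 128) split into 16 buckets of 8. A 16-bit summary bitmap
    // marks occupied buckets; each occupied bucket keeps its sorted 3-bit offsets
    // (stored here as a plain int[] per bucket for simplicity).
    final class BucketedSet {
        final int summary;      // bit i set => bucket i is occupied
        final int[][] offsets;  // offsets[i] = sorted offsets (0..7) within bucket i

        BucketedSet(int[] sortedValues) {
            List<List<Integer>> perBucket = new ArrayList<>();
            for (int i = 0; i < 16; i++) perBucket.add(new ArrayList<>());
            int s = 0;
            for (int v : sortedValues) {
                s |= 1 << (v >>> 3);
                perBucket.get(v >>> 3).add(v & 7);
            }
            summary = s;
            offsets = new int[16][];
            for (int i = 0; i < 16; i++)
                offsets[i] = perBucket.get(i).stream().mapToInt(Integer::intValue).toArray();
        }

        // Step 1: AND the summary bitmaps. Step 2: merge offsets only where a 1 remains.
        static List<Integer> intersect(BucketedSet a, BucketedSet b) {
            List<Integer> out = new ArrayList<>();
            int common = a.summary & b.summary;
            while (common != 0) {
                int bucket = Integer.numberOfTrailingZeros(common);
                common &= common - 1;
                int[] x = a.offsets[bucket], y = b.offsets[bucket];
                for (int i = 0, j = 0; i < x.length && j < y.length; ) {
                    if (x[i] == y[j]) { out.add(bucket * 8 + x[i]); i++; j++; }
                    else if (x[i] < y[j]) i++;
                    else j++;
                }
            }
            return out;
        }
    }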
You have basically just described Elias-Fano coding, which is provably optimal. See these slides for a nice intro: http://shonan.nii.ac.jp/seminar/029/wp-content/uploads/sites/12/2013/07/Sebastiano_Shonan.pdf
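For reference, the core of Elias-Fano coding is a similar split into high bits (a bit vector) and fixed-width low bits; a rough encoding sketch, with illustrative names:

    // Sketch of the Elias-Fano split for n sorted values in [0, u):
    // the low l = floor(log2(u/n)) bits of each value go into a fixed-width array,
    // and element i sets bit (value >> l) + i in a bit vector, i.e. the gaps between
    // successive high parts are written in unary.
    final class EliasFanoSketch {
        static void encode(int[] sorted, int u) {
            int n = sorted.length;
            int l = 31 - Integer.numberOfLeadingZeros(Math.max(1, u / n));
            long[] highBits = new long[((u >>> l) + n) / 64 + 1];
            int[] lowBits = new int[n]; // in practice packed into l-bit fields
            for (int i = 0; i < n; i++) {
                lowBits[i] = sorted[i] & ((1 << l) - 1);
                int pos = (sorted[i] >>> l) + i;
                highBits[pos >>> 6] |= 1L << (pos & 63);
            }
            // Together this takes roughly n * (2 + log2(u/n)) bits.
        }
    }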
@KWillets
It is a reasonable approach. Have you tried implementing it?
I just love the way you turn simple problems into fun analysis!
No, I did some fiddling with it, but my employer at the time wasn’t interested, and I didn’t have the time to flesh it out fully. It might be fun to code up a simple benchmark.
@KWillets
Get in touch with me if you do.
It might be worth a shot.
It’s also easier in Rice coding, because the offsets/remainders are fixed width.
Clearing the sparse bitmap can take a while and that’s a pain if it’s a frequent operation. Russ Cox gives a nice explanation of an old technique for using uninitialised memory for a sparse set that allows a fast clear with slightly higher cost to the set and test operations. http://research.swtch.com/sparse
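The structure Cox describes (the Briggs-Torczon sparse set) keeps a dense array of members and a sparse array mapping values to their slots, so clearing is just resetting a counter. A rough Java sketch (Java zero-initializes arrays, so the uninitialised-memory aspect is lost, but the constant-time clear remains):

    // Sketch of the sparse-set trick: 'dense' lists the members, 'sparse' maps a
    // value to its slot in 'dense', and clearing only resets the counter.
    final class SparseSet {
        private final int[] dense;
        private final int[] sparse;
        private int n = 0;

        SparseSet(int universe) {
            dense = new int[universe];
            sparse = new int[universe];
        }

        boolean contains(int v) {
            return sparse[v] < n && dense[sparse[v]] == v;
        }

        void add(int v) {
            if (!contains(v)) {
                dense[n] = v;
                sparse[v] = n;
                n++;
            }
        }

        void clear() {
            n = 0; // no need to touch the arrays
        }
    }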
I read somewhere that the most efficient data structure to represent a deck of playing cards is actually a bitmap, perhaps organised as 4*16-bit numbers representing the suits. Then it becomes much simpler/faster to detect various hands using bitwise operations.
This is most useful in AI where a lot of eventualities need to be computed quickly.
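For concreteness, here is a rough Java sketch of one such layout: 16 bits per suit packed into a single 64-bit word, one bit per rank, so hand tests reduce to masks and popcounts (the layout and names are illustrative, not a standard):

    // Sketch: a hand as a 64-bit word, 16 bits per suit, 13 rank bits used per suit.
    // Bit (16 * suit + rank) is set when the hand holds that card.
    final class CardBitmap {
        static long addCard(long hand, int suit, int rank) { // suit 0..3, rank 0..12
            return hand | (1L << (16 * suit + rank));
        }

        // Flush: some suit holds at least five cards.
        static boolean hasFlush(long hand) {
            for (int suit = 0; suit < 4; suit++) {
                if (Long.bitCount((hand >>> (16 * suit)) & 0x1FFF) >= 5) return true;
            }
            return false;
        }

        // Collapse suits: a rank bit is set if any suit holds that rank
        // (a starting point for straight detection).
        static int ranksPresent(long hand) {
            return (int) ((hand | (hand >>> 16) | (hand >>> 32) | (hand >>> 48)) & 0x1FFF);
        }
    }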
Any idea how EWAH compares in speed or size to “COmpressed ‘N’ Composable Integer SET”?
see: http://ricerca.mat.uniroma3.it/users/colanton/concise.html
I am trying to find a heuristic that would let a database engine decide which implementation to use.
@Maxime
I have written a benchmark to compare the various bitmap libraries, including Concise.
Please go to
https://github.com/lemire/simplebitmapbenchmark
If you are interested in the results I get on my machine, you can browse:
https://github.com/lemire/simplebitmapbenchmark/blob/master/RESULTS
As usual, there are trade-offs.
If you want to run the test yourself, grab a copy and execute the bash script “run.sh” (or the Windows equivalent).
If you want to hack the code or add your own benchmarks, I’ll be very interested in any contribution.