Daniel Lemire's blog

, 1 min read

To improve your indexes: sort your tables!

Many database indexes, including bitmap indexes, are sensitive to the order of the rows in your table. Many data warehousing practitioners urge you to sort your tables to get better results, especially with Oracle systems. In fact, column-oriented database systems like Vertica are built on sorted tables.

What do I mean by sorting the rows? It turns out that finding the best row reordering is a NP-hard problem. You might as well use something cheap like lexicographical sort as a heuristic. Finding an equally scalable technique that work much better is probably impossible. Ah! But there are different ways to sort lexicographically!

We wrote a few papers on this issue including one for DOLAP 2008. Here are the slides of our presentation: Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes View SlideShare presentation or Upload your own. (tags: dolarp2008)