Daniel Lemire's blog


Multicore programming? Yawn!

It looks like Intel is trying to push parallel programming. No doubt many colleges will keep surfing the parallel-programming hype and use it to predict a new surge of interest in Computer Science. Alas, there is no upcoming multicore revolution in computer programming.

  • For a large fraction of enterprise problems, the bottleneck is at the database level. The ubiquity of Web servers and distributed databases (see CouchDB) implies that many such problems are already parallelized. Database techniques like partitioning have been around for years to help you parallelize your databases. This blog runs on a server with several processors, and it has done so for years. Nothing new on the horizon.
  • MapReduce and Hadoop help you parallelize many of the remaining hard data processing problems without having to mess with threads, locking and synchronization.
  • Many hard problems are memory-bound: they are hard because not all of the data fits in memory. If your problem is memory-bound or IO-bound, throwing more processing cores at it may not help at all.
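
To make the MapReduce point concrete, here is a minimal sketch of the map, shuffle and reduce pattern in plain Python. It is not how Hadoop is implemented. The function names (map_phase, partition, reduce_phase, word_count) are hypothetical, and a thread pool stands in for a cluster of machines. What it tries to show is that the application code contains no locks: each chunk is mapped independently, keys are partitioned by hash, and each partition is reduced independently.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: count the words in one chunk, independently of all other chunks."""
    return Counter(chunk.lower().split())


def partition(partials, n_reducers):
    """Shuffle: route every word to exactly one reducer by hashing the key."""
    buckets = [defaultdict(list) for _ in range(n_reducers)]
    for partial in partials:
        for word, count in partial.items():
            buckets[hash(word) % n_reducers][word].append(count)
    return buckets


def reduce_phase(bucket):
    """Reduce: sum the partial counts for the keys held by one bucket."""
    return {word: sum(counts) for word, counts in bucket.items()}


def word_count(chunks, n_workers=4):
    """Run map, shuffle and reduce; the pool is a stand-in for many machines."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_phase, chunks))
        buckets = partition(partials, n_workers)
        reduced = list(pool.map(reduce_phase, buckets))
    result = {}
    for part in reduced:
        result.update(part)  # buckets hold disjoint keys, so nothing is overwritten
    return result
```

Calling word_count(["to be or not", "to be"]) counts "to" and "be" twice and "or" and "not" once. The hash partitioning in the shuffle step is the same idea as database partitioning: each key lives in exactly one place, so workers never need to coordinate. Note that this toy version still assumes the data fits in memory, which is exactly the assumption that fails for the hard problems.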

I have stated for a couple of years that storage, not processing power, is changing Information Technology. What is most amazing is our ability to record almost every single bit of information, and never have to delete or forget anything. On this topic, see my posts One More Step Toward Infinite Storage, Solid-state drives: when external memory becomes as fast as internal memory and What is infinite storage?

The truth is that we are not very good at dealing with large quantities of data. Does anyone know what to do when handed 50 terabytes of raw data? Few of us have the skills required to manage and leverage extremely large databases. Those will be the valuable skills in the future.