Daniel Lemire's blog

Getting serious about online teaching

2 min read

Earlier this month, Michael Mitzenmacher told us about the record number of students attending his Harvard class online-only. Yesterday, Dick Lipton predicted that online learning will replace campus learning: “I see no reason that On [Online Universities] could not do as good a job as Un…

You know your research is original when…

1 min read

Many consider Frank Herbert’s Dune the most important work of science fiction ever written. Consider that Star Wars is just a variation on Dune. Yet Dune was rejected by more than twenty publishers before finally being published. It is likely that publishers rejected Dune precisely because it was…

Writing tools to improve your research productivity

2 min read

Researchers, at least in Computer Science, spend most of their days at a desk typing. Picking the right software for writing is important. Most of my writing time is spent on LaTeX documents. I have tried typical word processors in the past, but they get in my way. Indeed, by mixing…

The fundamental properties of computing

1 min read

Physics works with fundamental properties such as mass, speed, acceleration, energy, and so on. Quantum mechanics has a well-known trade-off between position and momentum: you can know where I am, or how fast I am going, but not both at the same time. Algorithms (and their implementations) also…

Actual programming with HTML and CSS (without JavaScript)

1 min read

I usually stick with academic or research issues, but today, I wanted to have some fun. Geek fun. While W3C describes Cascading Style Sheets (CSS) as a mechanism for adding style (e.g. fonts, colors, spacing) to Web documents, it is also a bona fide programming language. In fact, it is one of the…

The end of “mass universities”

2 min read

In the late sixties and seventies, we wanted universities to become more accessible. We founded the Open University, the Université du Québec, and many other universities with accessibility as part of their mandate. The stated goal was to make degrees more accessible. We succeeded. Yet, we are…

Database Questions for 2010: What’s On My Mind

1 min read

I started 2009 with an interest in Web 2.0 OLAP and collaborative data processing. The field of collaborative data processing has progressed tremendously. Last year, we got Google Fusion Tables, and data warehousing products are getting more collaborative. In 2010, my research might focus more on…

My best blog posts (2009)

1 min read

As 2009 comes to an end, I have selected a few of my best blog posts.
Database, compression and column stores:
More database compression means more speed? Right?
Trading compression for speed with vectorization
Column stores and row stores: should you care?
Changing your perspective: horizontal,…

Entropy-efficient Computing

1 min read

Microprocessors and storage devices are subject to the second law of thermodynamics: using them turns usable energy (oil, hydrogen) into unusable energy (heat). Data centers are already limited by their power usage and heat production. Moreover, many new devices need to operate for a long time with…

Run-length encoding (part 3)

2 min read

In Run-length encoding (part 1), I presented the various run-length encoding formats. In part 2, I discussed the coding of the counters. In this third part, I want to discuss the ordering of the elements. Indeed, the compression efficiency of run-length encoding depends on the ordering. For…
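To illustrate why ordering matters, here is a minimal sketch that counts the runs in a small column before and after reordering it. The `count_runs` helper and the sample column are hypothetical, and plain sorting is used only as one possible reordering; the excerpt above does not say which orderings the post considers.

```python
from itertools import groupby

def count_runs(values):
    """Count the maximal runs of identical consecutive values."""
    return sum(1 for _ in groupby(values))

# Hypothetical column with three distinct values (illustrative data only).
column = ["B", "A", "B", "A", "C", "A", "B", "C"]

unsorted_runs = count_runs(column)        # no two neighbours are equal
sorted_runs = count_runs(sorted(column))  # equal values become adjacent

print(unsorted_runs, sorted_runs)
```

Fewer runs means fewer (counter, value) pairs to store, which is why the same data can compress well or badly depending on how it is ordered.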

Why you should be a global warming skeptic

2 min read

The debacle of the leaked emails, data and code from the University of East Anglia showed that reputed global warming scientists were petty and cheated. As is often the case, the pursuit of excellence comes at the expense of rigor. To put a stop to growing skepticism, Scientific American published Seven…

Run-length encoding (part 2)

3 min read

(This is a follow-up to my previous blog post; there is also a follow-up: part 3.) Any run-length encoding requires you to store the number of repetitions. In my example, AAABBBBBZWWK becomes 3A-5B-1Z-2W-1K: we must store 5 counters (3, 5, 1, 2, 1) and 5 characters. Storing counters using a fixed…
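The example above can be reproduced with a short sketch. The function names `rle_encode` and `format_runs` are hypothetical, and the dash-separated output simply mirrors the notation used in the excerpt; it is not meant as a storage format.

```python
from itertools import groupby

def rle_encode(text):
    """Return one (count, character) pair per run of repeated characters."""
    return [(len(list(run)), char) for char, run in groupby(text)]

def format_runs(runs):
    """Render the runs in the count-then-character notation of the excerpt."""
    return "-".join(f"{count}{char}" for count, char in runs)

runs = rle_encode("AAABBBBBZWWK")
counters = [count for count, _ in runs]
encoded = format_runs(runs)

print(encoded)   # 3A-5B-1Z-2W-1K
print(counters)  # [3, 5, 1, 2, 1]
```

The five counters are what the rest of the post is concerned with: how many bits each one costs depends on how the counters are coded.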

Run-length encoding (part I)

3 min read

(This is part 1; there is also a part 2 and a part 3.) Run-length encoding (RLE) is probably the most important and fundamental string compression technique. Countless multimedia formats and protocols use one form of RLE compression or another. RLE is also deceptively simple. It represents repeated…

More database compression means more speed? Right?

2 min read

Current practical database compression techniques stress speed over compression: Vectorwise is using Super-scalar RAM-CPU cache compression which includes a carefully implemented dictionary coder. C-store—and presumably Vertica—is using similar compression techniques as well as…

Which should you pick: a bitmap index or a B-tree?

1 min read

Morteza Zaker sent me a pointer to their work comparing bitmap indexes and B-trees in the Oracle database. They examine the folklore surrounding bitmap indexes, which are often thought to be mostly useful over low-cardinality columns (columns having few distinct values, such as gender). Their…
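For readers unfamiliar with the idea, here is a minimal sketch of a bitmap index over a low-cardinality column. It is an illustration only: the helper names and the sample column are hypothetical, Python integers stand in for bitmaps, and nothing here reflects how Oracle implements its indexes.

```python
def build_bitmap_index(column):
    """Map each distinct value to an integer whose bit i is set when row i holds it."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index

def matching_rows(bitmap):
    """List the row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows

# Hypothetical column with only two distinct values.
gender = ["F", "M", "F", "F", "M"]
index = build_bitmap_index(gender)
female_rows = matching_rows(index["F"])

print(female_rows)  # [0, 2, 3]
```

There is one bitmap per distinct value, so a column with few distinct values needs few bitmaps, which is the intuition behind the folklore mentioned above.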

Procrastination can be your friend

1 min read

Procrastination can be a serious problem leading to job loss, high anxiety and even significant psychological disability and dysfunction (according to Wikipedia). To avoid excessive procrastination, most researchers grow a sense of professional urgency. Most people rely on extrinsic pressures. In…

Reading recommendation: Saturn’s Children by Charles Stross

1 min read

I just finished Saturn’s Children. This is my third Charles Stross novel after Accelerando and Glasshouse. Saturn’s Children presents itself as a light space opera novel. The hero is a robot-sex-slave who is running for her life, in a post-human world. The author does a great job of making…

Top 25 Canadian Universities by Research Funding (2009)

1 min read

University of Toronto (where I got my B.Sc. and M.Sc.)
University of Alberta
University of British Columbia
Université de Montréal (where I got my Ph.D.)
McGill University
McMaster University
Université Laval
University of Ottawa
University of Calgary
University of Western Ontario
University of…

The secret behind radical innovation

1 min read

Our global knowledge grows in slow, incremental steps. Darwin and Einstein mostly reinterpreted existing ideas. However, practical implementations sometimes take the world by storm. You might think that the experts are responsible for changing the world. Unfortunately, experts are not good at…

Become independent of peer review

2 min read

When I asked the director of a large—and successful—British software house his most serious problem, he said without hesitation “how to prevent clusters of incompetence from emerging”. I was reminded of that when I noticed the—for me unusual—weight given to the “peer review”. What,…