Daniel Lemire's blog

I’m an introvert. And that’s ok.

2 min read

I’m an introvert. That’s why you don’t see me at meetings and celebrations. If you do, I’m in a corner looking awkward. That’s why I’m not trying to build a large laboratory of busy graduate students. That’s why I crave time alone to reflect and think, to write and code… I am not…

What happens when you get more Ph.D.s?

2 min read

Following the fall of the USSR, hundreds of world-class mathematicians emigrated to the USA. Intuitively, this should have made American mathematics stronger. Did it? Borjas and Doran examined the problem. Their starting point was the realization that the expertise of Soviet mathematicians differed…

Bitmaps are surprisingly efficient

3 min read

Imagine you have to copy an array, and update a few values in the process. What is the most efficient implementation? Let us look at a concrete example. I am given this array: 0,5,1,4,5,1,10,4. I want to create a new array with these values: 0,5,1,40,5,1,100,4. Most programmers would follow the…
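As a point of reference, here is a minimal sketch of one straightforward way to do this: copy the whole array, then overwrite the changed positions. It is not necessarily the approach the post goes on to discuss; the function name `copy_with_updates` and the dictionary of updates are assumptions made for illustration.

```python
def copy_with_updates(values, updates):
    """Return a copy of `values` with the entries in `updates` replaced.

    `updates` maps a position to its new value (hypothetical interface).
    """
    result = list(values)  # copy the array first
    for index, new_value in updates.items():
        result[index] = new_value  # then patch the few changed positions
    return result


original = [0, 5, 1, 4, 5, 1, 10, 4]
updated = copy_with_updates(original, {3: 40, 6: 100})
print(updated)
```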

Effective compression using frame-of-reference and delta coding

4 min read

Most generic compression techniques are based on variations on run-length encoding (RLE) and Lempel-Ziv compression. Compared to these techniques and on the right data set, frame-of-reference and delta coding can be faster for a comparable compression rate. Mathematically, frame-of-reference and…
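To make the two techniques concrete, here is a small sketch of their textbook forms, assuming a list of integers as input. Delta coding stores the differences between successive values; frame-of-reference stores each value as an offset from the minimum of its block. The function names and the sample data are assumptions, and the sketch skips the bit packing a real implementation would do.

```python
def delta_encode(values):
    """Keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    values, total = [], 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def for_encode(values):
    """Frame-of-reference: store offsets from the block minimum.

    Returns the base, the bits needed per offset, and the offsets.
    """
    base = min(values)
    offsets = [v - base for v in values]
    bits = max(offsets).bit_length()
    return base, bits, offsets


def for_decode(base, offsets):
    """Add the base back to every offset."""
    return [base + offset for offset in offsets]


sample = [1000, 1003, 1004, 1010, 1011]  # made-up, sorted data
deltas = delta_encode(sample)            # small numbers after the first
base, bits, offsets = for_encode(sample)
```

On this sample, every offset fits in 4 bits, whereas each original value needs 10 bits. That gap is where the space saving would come from once the offsets are bit-packed.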

Two rules for teaching in the XXIst century

9 min read

Education in the XXth century has been primarily industrial: organize the workers (students) in groups under the supervision of a manager (teacher). We all have been in such systems for so long that we take it for granted. How else is anyone to learn? Maybe some can learn differently, but most can’t…

How to revise research papers after receiving harsh reviews

2 min read

Whether you submit your work to a scientific journal or just post it on a blog, you can expect to receive harsh criticism from time to time. Sometimes you are facing arrogant or ignorant readers. Other times, your work is genuinely flawed. My own work is frequently flawed, as you know if you read this…

Open access journals in Computer Science

3 min read

Open access journals make articles freely available. Some of them even allow the authors to keep the copyright of their work. It would seem that they offer a compelling alternative to traditional journals, especially if you hope to reach people outside academia. However, open access may allow…

Should you boycott academic publishers?

3 min read

There is a growing list of famous scientists who have pledged to boycott Elsevier as a publisher. If I were in charge of Elsevier, I would be very nervous: academic publishers need famous authors more than the famous authors need the publishers. After all, famous scientists could simply post their…

Use random hashing if you care about security?

4 min read

Hashing is a programming technique that maps objects (such as strings) to integers. It is a necessary component of hash tables, one of the most frequently used data structures in Computer Science. Typically, hash tables have the property that looking up or storing a value associated with a key…
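As a toy illustration of mapping strings to integers, the sketch below mixes a randomly chosen seed into a simple polynomial hash, so that the mapping differs from one run to the next. The function `seeded_hash`, the multiplier 31 and the 16-bucket table are assumptions for illustration only; this is neither a secure hash nor a proper universal hash family.

```python
import secrets


def seeded_hash(text, seed, bits=32):
    """Map a string to an integer in [0, 2**bits), starting from `seed`."""
    mask = (1 << bits) - 1
    h = seed & mask
    for byte in text.encode("utf-8"):
        h = (h * 31 + byte) & mask  # simple polynomial step, kept to `bits` bits
    return h


# Draw the seed once, e.g. at program start, so an outsider cannot easily
# predict which keys land in the same bucket.
seed = secrets.randbits(32)
bucket = seeded_hash("some key", seed) % 16  # slot in a 16-bucket table
```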

Open science: why is it so hard?

4 min read

Open access is the idea that scholarship should be accessible to all. Many believe that we should require publicly funded researchers to make their work available to the public. That is, if some professor discovers a new algorithm or a new remedy while on a government grant, you should be able to…

Do we need patents?

3 min read

Whenever I suggest that patents are harmful, people point to the pharmaceutical industry. The pharmaceutical industry is heavily regulated. Marketing a new drug is a lengthy and expensive process. Moreover, drugs are subject to strict patent laws. The rationalization for patents usually goes like…

Are you a gold prospector, or a construction worker?

3 min read

Most work is akin to construction jobs: you work until the house is built. You just have to keep the servers running day after day. You keep writing code day after day. You teach another class. The only risk you take is that there may not be work for you next year. The work is not, in itself, risky.…

My favorite posts from 2011

1 min read

January: Innovating without permission; Not even eventually consistent. February: Taking scientific publishing to the next level; Ten things Computer Science tells us about bureaucrats. March: Know the biases of your operating system; Breaking news: HTML+CSS is Turing complete. April: How…

Compressing document-oriented databases by rewriting your documents

2 min read

The space utilization of relational databases can be estimated quickly. If you create a table made of three columns, each containing an integer, you can expect the database to use roughly 12 bytes per row, plus some overhead. Unless your database is tiny, how you name your columns is irrelevant to…
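The 12-byte figure follows from three integers of 4 bytes each. The sketch below checks that arithmetic with Python's `struct` module, assuming 32-bit integers; the helper `estimated_table_bytes` and its `per_row_overhead` parameter are hypothetical, since actual overhead depends on the database engine.

```python
import struct

# Three 4-byte integers, packed with no padding ("<" selects standard sizes).
ROW_FORMAT = "<3i"
row_size = struct.calcsize(ROW_FORMAT)  # 3 columns x 4 bytes


def estimated_table_bytes(row_count, per_row_overhead=0):
    """Rough size estimate: rows times (payload + assumed overhead)."""
    return row_count * (row_size + per_row_overhead)


print(row_size, estimated_table_bytes(1_000_000))
```

Note that the column names appear nowhere in this estimate, which is the point the excerpt makes about relational tables.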

Dealing with harsh criticism

2 min read

Scott Adams, of Dilbert fame, once told how Dilbert fared poorly initially. His critics objected that Dilbert was hardly ever funny, except when he appeared at the office. Instead of falling prey to discouragement, Adams decided to portray Dilbert almost exclusively at the office from then on. And…

3 surprising facts about the computation of scalar products

2 min read

The speed of many algorithms depends on how quickly you can multiply matrices or compute distances. In turn, these computations depend on the scalar product. Given two arrays such as (1,2) and (5,3), the scalar product is the sum of products 1 × 5 + 2 × 3. We have strong incentives to…
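For reference, the definition above can be written as a short function. This is a plain, unoptimized sketch; the name `scalar_product` is an assumption.

```python
def scalar_product(x, y):
    """Sum of the pairwise products of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("both arrays must have the same length")
    return sum(a * b for a, b in zip(x, y))


result = scalar_product((1, 2), (5, 3))  # 1 * 5 + 2 * 3
print(result)  # 11
```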

Where do debt, credit and currencies come from?

3 min read

We often believe that primitive cultures lacked currencies and so they engaged in barter. Barter is awfully inconvenient and simply cannot sustain a non-trivial economy. Thus, we conclude that someone invented currencies out of convenience. Later, some astute fellow invented credit and loans. In…

Real scientists never report fraud

2 min read

Diederik Stapel has been a psychology professor at major universities for the last ten years. He published well over 100 research papers in prestigious journals such as Science. Some of his research papers have been highly cited. He trained nearly 20 Ph.D. students. He was recently fired when it…

My favorite LaTeX editor for MacOS: Texpad

4 min read

I always found word processors distracting. I hate to copy and paste text only to find that the text formatting was copied as well. When I write, I do not want to have to worry about typesetting or page formatting issues. And I prefer a straight text file as a file format: it plays nicely with…