Daniel Lemire's blog

Innovation as a Fringe Activity

, 18 min read

What do these people have in common: Marconi, Alexander Graham Bell, and the Steves Wozniak and Jobs? At least one commonality is that approximately nobody listened to them or cared about what they were doing until there was simply no way to ignore them anymore. In effect, they were living and…

Faster dictionary decoding with SIMD instructions

, 4 min read

A particularly fast and effective compression technique is dictionary coding. Intuitively, it works as follow. Suppose you are given a long document made of millions of words, but containing only 65536 distinct words. You can create a map from words to short integers or indexes (in [0,65536)). So…

How many reversible integer operations do you know?

, 2 min read

Most operations on a computer are not reversible… meaning that once done, you can never go back. For example, if you divide integers by 2 to get a new integer, some information is lost (whether the original number was odd or even). With fixed-size arithmetic, multiplying by two is also…

Let us talk about the Luddite problem…

, 8 min read

This morning I woke up to an interview on the radio (yes, I still have a radio somehow) with pharmacists who try to fulfill prescriptions by mail. This is 2016. I can probably get almost all the chemical compounds necessary to create most drugs delivered to my home from China… but I need to go…

Combine smart people with crazily hard projects

, 3 min read

Back in college, professors assigned crazily hard problems… and I was forced to talk with my peers to figure out how they fared… and eventually teaming up with some of them. The same pattern has repeated itself again and again in my life: smart people and really hard problems. That’s how I am…

Common sense in artificial intelligence… by 2026?

, 5 min read

Common sense is nothing more than a deposit of prejudices laid down in the mind before age eighteen (Albert Einstein) Lots of people want to judge machine intelligence based on human intelligence. It dates back to Turing who proposed his eponymous Turing test: can machines “pass” as human…

Accelerating PHP hashing by “unoptimizing” it

, 4 min read

Hashing is a software trick that can map strings to fixed-length integers, such as 32-bit integers. It is ubiquitous in modern software. Languages like Java and PHP have the ability to store strings with their corresponding hash values. Still, the hash value must be computed at least once. How much…

Augmented reality becomes mainstream

, 2 min read

My go-to reference lately about the near future has been the 2006 novel Rainbows End by Vernor Vinge. The novel is set in 2025 and the author depicts a world where augmented reality is ubiquitous. Kids still go to school, but instead of presenting the working of a turbine using a PowerPoint deck,…

Virtual Reality: First impressions with the HTC Vive

, 3 min read

I just got my hands on some virtual-reality (VR) goggles. Specifically, we have an “HTC Vive“. We are still in the early days of VR and given that these goggles cost about a $1000, not everyone will get to try them outside a demo room. I thought I’d share my impressions. There are…

Fast random shuffling

, 4 min read

In a random shuffle, you want to take the elements of a list and reorder them randomly. In a “fair” random shuffle, all possible permutations must be equally likely. It is surprisingly hard to come up with a fair algorithm. Thankfully, there is a fast and easy-to-implement algorithm: the…

A fast alternative to the modulo reduction

, 7 min read

Suppose you want to pick an integer at random in a set of N elements. Your computer has functions to generate random 32-bit integers, how do you transform such numbers into indexes no larger than N? Suppose you have a hash table with a capacity N. Again, you need to transform your hash values…

I do not use a debugger

, 7 min read

I learned to program with BASIC back when I was twelve. I would write elaborate programs and run them. Invariably, they would surprise me by failing to do what I expect. I would struggle for a time, but I’d eventually give up and just accept that whatever “bugs” I had created were there to…

How fast is tabulation-based hashing? The downsides of Zobrist…

, 2 min read

In practice, hashing is the process of taking an input, such as a string, and turning it into an integer value. It is a fundamental tool in programming, as most software relies on hashing in one way or another. We often expect hashing to appear “random” so that any two strings are unlikely to…

The strange case of the copyright of open-source software

, 2 min read

Economists make a grave mistake when they fail to mention open-source software as one of the critical innovation of our era. Open-source software offers a great reference for the type of innovation we will get in the post-industrial era. To economists, open-source software does not matter because…

To be smart, work on problems you care about

, 7 min read

Never do anything that bores you. My experience in science is that someone is always telling to do something that leaves you flat. Bad idea. I’m not good enough to do something I dislike. In fact, I find it hard enough to do something that I like. (James Watson) A high school student who is…

Enough with the bogus medical studies!

, 6 min read

Every week, we hear about how eating such and such food gives cancer, or how working out can save you from a heart attack. If you have been reading these studies with attention and changing your life to follow their recommendations… you have been wasting your time and you have possibly harmed…

The surprising cleverness of modern compilers

, 1 min read

I wanted to know how a modern C compiler like clang would process the following C code: #include <stdint.h> int count(uint64_t x) { int v = 0; while(x != 0) { x &= x - 1; v++; } return v; } Can you guess? popcntq %rdi, %rax That is right. A fairly sophisticated C…

We are passing the Turing test right on schedule

, 4 min read

In 1950, the brilliant computing pioneer Alan Turing made the following prediction in his paper Computing Machinery and Intelligence: I believe that in about fifty years’ time it will be possible, to programme computers, with a storage capacity of about 109, to make them play the imitation game…

Professors intentionally slow down science to make themselves look better

, 4 min read

Recently, the president of the United States announced a big anti-cancer initiative, to be headed by his vice-president Joe Biden. Will it be fruitful? Maybe. But an important obstacle has become clear. Researchers are glad to receive funds from the government, but they do not want to share back…

Email: how to be polite and efficient

, 2 min read

Email is an old platform, but it still represents the cornerstone of most of our work online. Surprisingly, many people seem to be using email poorly. Here are a few basic rules to keep us productive. Long emails are inefficient because people do not read them. Angry emails should be used with…