Daniel Lemire's blog

What is “modern” programming?

, 6 min read

As a young teenager, I dabbled with basic and some assembly. Things got serious when I learned Turbo Pascal. “Now we are talking”, I thought. Turbo Pascal offered one of the earliest examples of Integrated Development Environment (IDE). In effect, an IDE is a program that lets you conveniently…

Science and Technology links (July 14th, 2017)

, 4 min read

PC shipments are at the lowest level of the last 10 years, and they have been declining for the last two years. Using smartphone data, researchers are able for the first in history to measure objectively how active people are. Lots of prior research relied on questionnaires, but self-reported…

Pruning spaces faster on ARM processors with Vector Table Lookups

, 4 min read

Last week, I asked how fast one could remove spaces from a string using ARM processors. On a particular benchmark, I got 2.4 cycles per byte using regular (scalar) code and as little as 1.8 cycles per byte using ARM NEON instructions. These are “vectorized instructions” that you find in…

Are your strings immutable?

, 3 min read

A value is immutable if it cannot change. Immutability is a distinct notion than that of a constant. The speed of light in a vacuum is believed to be a universal constant, for example. Constants are immutable in the sense that they cannot change. However, immutability refers to values, not to the…

Science and Technology links (July 7th, 2017)

, 2 min read

People magazine recently named Julia Roberts, who is 49, as the World’s Most Beautiful Woman. Volvo plans to commercialize self-driving cars in 2020, and all electric by 2019. France will ban petrol cars in 2040. The Fermi paradox is the idea that we ought to see intelligent life in the universe,…

Pruning spaces from strings quickly on ARM processors

, 4 min read

Suppose that I give you a relatively long string and you want to remove all spaces from it. In ASCII, we can define spaces as the space character (‘ ‘), and the line ending characters (‘\r’ and ‘\n’). I am mostly interested in algorithmic and performance issues, so we can simplify the…

Science and Technology links (July 1st, 2017)

, 4 min read

Canada is 150 years old today. The iPhone is 10 years old this year. We can safely say that the iPhone 7 is over a hundred times faster, in almost every way than the original iPhone. Very few things get 100 times better over 10 years. You have to improve the performance by 60% each and every…

Video game review… Nier: Automata

, 5 min read

Single-player RPG games are having a tough time. Last year I reviewed Deus Ex: Mankind Divided. Though I felt it was an excellent game, it was not a commercial success and it seems that there will not be a follow-up game in the series in the foreseeable future. More recently, I reviewed Mass…

Science and Technology links (June 23rd, 2017)

, 1 min read

Elon Musk, Making Humans a Multi-Planetary Species, New Space. June 2017, 5(2): 46-61. Reportedly, Ikea is working on augmented reality software that would allow you to see the furniture in your home before buying it. Current virtual reality headsets provide a good experience, but if you ever tried…

Top speed for top-k queries

, 4 min read

Suppose that I give you a stream of values and you want to maintain the top-k smallest or largest values. You could accumulate all these values, sort them all and keep the smallest k values. You can do somewhat better by using a binary heap to construct a priority queue. And you can also use the…

Science and Technology links (June 16th, 2017)

, 4 min read

How much bandwidth do we have? It seems that each of our eyes has 1 megabyte per second. That’s about 100GB per day assuming you close one of your eyes and you never sleep. I found a fascinating article on “academic urban legends” including the myth that spinach is a good source of…

QuickSelect versus binary heap for top-k queries

, 3 min read

In a previous post, I considered the problem of finding the k smallest (or k largest) elements from a stream of values. The naive approach is to collect all the values in an array, sort the array, and return the first k values. When an index is not available, that is how some relational databases…

Science and Technology links (June 9th, 2017)

, 4 min read

This week, Apple told us about two new interesting pieces of technology. On the one hand, we have ARKit. It makes it easy for developers to build and deploy “augmented reality” applications to the iPhone in your pocket. What is augmented reality? In the visual space, it is a way to add…

Quickly returning the top-k elements: computer science vs. the real world

, 2 min read

A standard problem in computer science is to quickly return the smallest or largest K elements of a list. If the list is made of N elements, then we can solve this problem in time N log (K) using a binary heap. How would the average programmer solve this problem? My guess is that unless K is 1,…

Science and Technology links (June 2nd, 2017)

, 2 min read

Methylene blue rejuvenates old skin. You can buy methylene blue on Amazon and spray it on your face. It is reportedly safe, even in high concentration. Of course, you are likely to turn blue. A school in the UK is offering a degree in eSport (video games). There are 20 CRISPR (gene editing) clinic…

Unsigned vs. signed integer arithmetic

, 4 min read

Given any non-negative integers x and d, we can write uniquely x = q d + r where q (the quotient) and r (the remainder) are non-negative and r is less than d. We write r = x mod d. Most modern processors represent signed and unsigned integers in the following manner: Unsigned integers as simply…

Science and Technology links (May 26th, 2017)

, 4 min read

Linux, the operating system driving cloud computing and the web, was developed using an open source model. For a time, Linux was seen as a direct competitor to Microsoft, but things have changed and Microsoft is happy to see Linux as just a piece of technology. Because of how large and complicated…

Counting exactly the number of distinct elements: sorted arrays vs. hash sets?

, 3 min read

Suppose that you have ever larger sets of 64-bit integers, and you want to quickly find out how many distinct integers there are. So given {10, 12, 10, 16}, you want an algorithm to output 3, as there are three distinct integers in the set. I choose 64-bit integers, but strings would do fine as…