Daniel Lemire's blog

Mapping an interval of integers to the whole 64-bit range, fairly?

3 min read

In my blog post A fast alternative to the modulo reduction, I described how one might map 64-bit values to an interval of integers (say from 0 to N) with minimal bias and without using an expensive division. All one needs to do is to compute x * N ÷ 2^64 where ‘÷’ is the integer division. A…
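As a minimal sketch of the computation x * N ÷ 2^64 described in this excerpt: the function name `reduce_to_range` is hypothetical, and the code assumes a compiler that offers the `unsigned __int128` extension (GCC and Clang do) so that the full 128-bit product is available.

```c
#include <assert.h>
#include <stdint.h>

// Map a 64-bit value x to an integer in [0, n) by computing (x * n) / 2^64.
// The product of two 64-bit values needs 128 bits, so we widen first;
// unsigned __int128 is a GCC/Clang extension, not standard C.
static inline uint64_t reduce_to_range(uint64_t x, uint64_t n) {
  assert(n > 0); // the target interval must be non-empty
  return (uint64_t)(((unsigned __int128)x * n) >> 64);
}
```

Shifting right by 64 bits keeps the upper half of the product, which is the integer division by 2^64, so no division instruction is needed.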

Programming inside a container

5 min read

I have a small collection of servers, laptops and desktops. My servers were purchased and configured at different times. By design, they have different hardware and software configurations. I have processors from AMD, Intel, Ampere and Rockchip. I have a wide range of Linux distributions, both old…

Science and Technology links (May 16th 2020)

1 min read

Most of our processors, whether in our PCs or mobile phones, are 64-bit processors. In the case of your PC, it has been so for a couple of decades. Unfortunately, we have been stuck with 32-bit operating systems for a long time. Apple has stopped supporting 32-bit applications in the most recent…

Encoding binary in ASCII very fast

1 min read

In software, we typically work with binary values. That is, we have arbitrary streams of bytes. To encode these arbitrary streams of bytes in standard formats like email, HTML, XML, JSON, we often need to convert them to a standard format like base64. You can encode and decode base64 very…

Science and Technology links (May 2nd 2020)

5 min read

As we age, we tend to produce less NAD+, an essential chemical compound for our bodies. We can restore youthful levels of NAD+ by using freely available supplements such as NMN. It is believed that such supplements have an anti-aging or rejuvenating effect. Some scientists believe that…

For case-insensitive string comparisons, avoid char-by-char functions

1 min read

Sometimes we need to compare strings in a case-insensitive manner. For example, you might want ‘abc’ and ‘ABC’ to be considered equal. It is a well-defined problem for ASCII strings. In C/C++, there are basically two common approaches. You can do whole string comparisons: bool isequal =…

Sampling efficiently from groups

5 min read

Suppose that you have to sample a student at random in a school. However, you cannot go into a classroom and just pick a student. All you are allowed to do is to pick a classroom, and then let the teacher pick a student at random. The student is then removed from the classroom. You may then have to…

Science and Technology links (April 25th 2020)

2 min read

People’s muscles tend to become weaker with age, a process called sarcopenia. It appears that eating more fruits and vegetables is associated with far lower risks of sarcopenia. Unfortunately, it is merely an association: it does not mean that if you eat more fruits and vegetables, you are at…

Rounding integers to even, efficiently

2 min read

When dividing a numerator n by a divisor d, most programming languages round “down”. It means that 1/2 is 0. Mathematicians will insist that 1/2 is not 0 and claim that you are really computing floor(1/2). But let me think like a programmer. So 3/2 is 1. If you always want to round up, you can instead…
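To make the rounding behaviour in this excerpt concrete, here is a small sketch for unsigned integers. The function names are hypothetical, and the round-up variant is a common idiom offered as an assumption; the excerpt is cut off before it gives its own formula, so this may not be what the post goes on to describe.

```c
#include <assert.h>
#include <stdint.h>

// Integer division in C discards the fractional part; for unsigned values
// this rounds down, so 1/2 is 0 and 3/2 is 1.
static inline uint64_t div_round_down(uint64_t n, uint64_t d) {
  assert(d > 0); // division by zero is undefined
  return n / d;
}

// A common idiom for rounding up (an assumption, not taken from the post):
// add 1 whenever the division leaves a remainder.
static inline uint64_t div_round_up(uint64_t n, uint64_t d) {
  assert(d > 0);
  return n / d + (n % d != 0);
}
```

Testing the remainder avoids the overflow that the shorter `(n + d - 1) / d` form can hit when n is close to the maximal 64-bit value.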

Science and Technology links (April 11th 2020)

3 min read

Greenland sharks reach their sexual maturity when they are 150 years old and they live hundreds of years. Some living sharks today were born in the 16th century. Only 5% of the general population rates themselves as below average in intelligence. And they overestimate the intelligence of their…

Multiplying backward for profit

6 min read

Most programming languages have integer types with arithmetic operations like multiplications, additions and so forth. Our main processors support 64-bit integers which means that you can deal with rather large integers. However, you cannot represent everything with a 64-bit integer. What if you…

Science and Technology links (April 4th 2020)

1 min read

Antarctica was once a rainforest. Google’s DeepMind built artificial intelligences that can defeat human beings at all of the standard Atari (arcade) games. The risk of death and disability after a stroke fell substantially between 2000 and 2015. The U.S. Space Force announced that the Space…

We released simdjson 0.3: the fastest JSON parser in the world is even better!

3 min read

Last year (2019), we released the simdjson library. It is a C++ library available under a liberal license (Apache) that can parse JSON documents very fast. How fast? We reach and exceed 3 gigabytes per second in many instances. It can also parse millions of small JSON documents per second. The new…

Science and Technology links (March 28th 2020)

2 min read

In a laboratory, we know how to turn any of our cells into youthful stem cells using something called the Yamanaka factors. If you expose cells to such factors for a short time, they appear to remain functional specialized cells but become more youthful. Researchers demonstrated this theory using…

Avoiding cache line overlap by replacing one 256-bit store with two 128-bit stores

3 min read

Memory is organized in cache lines, frequently blocks of 64 bytes. On Intel and AMD processors, you can store and load memory in blocks of various sizes, such as 64 bits, 128 bits or 256 bits. In the old days, and on some limited devices today, reading and storing to memory required you to respect…

Number of atoms in the universe versus floating-point values

1 min read

It is estimated that there are about 10^80 atoms in the universe. The estimate for the total number of electrons is similar. It is a huge number and it far exceeds the maximal value of a single-precision floating-point type in our current computers (which is about 10^38). Yet the maximal value that…

Science and Technology links (March 14th 2020)

1 min read

Mothers, but not fathers, possess gender-related implicit biases about emotion expression in children. Chinese researchers used to be offered cash rewards for publishing research articles. The Chinese government has banned such rewards. Increasingly, we are collectively realizing that a…

Fast float parsing in practice

2 min read

In our work parsing JSON documents as quickly as possible, we found that one of the most challenging problems is to parse numbers. That is, you want to take the string “1.3553e142” and convert it quickly to a double-precision floating-point number. You can use the strtod function from the…
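As a rough illustration of the strtod baseline mentioned in this excerpt, the following sketch wraps the standard C `strtod` call. The helper name `parse_double` is hypothetical, and the sketch shows only the conventional approach, not the faster parsing technique the post is about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Parse a whole string as a double using the standard strtod function.
// Returns 1 and stores the value in *out on success; returns 0 if nothing
// was parsed or if characters are left over after the number.
static inline int parse_double(const char *s, double *out) {
  assert(s != NULL && out != NULL);
  char *end;
  double v = strtod(s, &end);
  if (end == s || *end != '\0') {
    return 0;
  }
  *out = v;
  return 1;
}
```

Checking the end pointer is what distinguishes a real parse from a failure, since strtod returns 0.0 when it cannot convert anything.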