Daniel Lemire's blog


Fast software is a discipline, not a purpose

25 thoughts on “Fast software is a discipline, not a purpose”

  1. Yosef says:

    “Don’t use floating-point operations when integers will do.”

    I don’t see why this is true – while integers are faster to add and subtract by a tiny margin, they are dramatically slower to divide (and to compute a modulus). That’s one reason why fixed-point numbers are used less than floating-point numbers.

    How about – “Choose types according to the operations you want to perform on them”?

      I don’t see why this is true – while integers are faster to add and subtract by a tiny margin, they are dramatically slower to divide (and to compute a modulus). That’s one reason why fixed-point numbers are used less than floating-point numbers.

      It is true that 64-bit integers can be slow when divisions are involved. But if you do need the full 64 bits, you cannot simply switch to floats: a double carries only a 53-bit significand.
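
      For instance, a minimal sketch showing that not every 64-bit integer survives a round trip through a double:

      ```cpp
      #include <cstdint>
      #include <cstdio>

      int main() {
          // A double has a 53-bit significand, so integers above 2^53
          // are not all exactly representable.
          uint64_t big = (1ULL << 53) + 1;      // 9007199254740993
          double d = static_cast<double>(big);  // rounds to 9007199254740992
          printf("%llu vs %.0f\n", (unsigned long long)big, d);
          return 0;
      }
      ```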

      Still, your point is well made.

    2. Jay says:

      I disagree. Mostly in light of the last bullet: “Learn how the data is actually represented in bits, and learn to dance with these bits when you need to.”

      Ultimately it really depends on what you’re trying to accomplish. And on your compiler, but a good developer shouldn’t expect a compiler to make up for their own inadequacy. Division and multiplication of integers can be computed via bitwise and binary operations instead of the costly counterparts… A well-designed algorithm might use a bitmask instead of a modulo and a zero-filling right shift for division (see Galois-field bit manipulation for one example). Personally, I’d expect the compiler to optimize nearby division/remainder operations into a single instruction, but it’s been my experience that unless you’re using the Intel compiler, properly, you’re out of luck.
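
      For what it’s worth, a minimal sketch of the bitmask/shift replacements, valid for unsigned values and power-of-two divisors:

      ```cpp
      #include <cassert>
      #include <cstdint>

      // For unsigned x and a power-of-two divisor 2^k:
      //   x % (1 << k)  ==  x & ((1 << k) - 1)
      //   x / (1 << k)  ==  x >> k
      uint64_t mod16(uint64_t x) { return x & 15; } // x % 16
      uint64_t div16(uint64_t x) { return x >> 4; } // x / 16

      int main() {
          assert(mod16(37) == 37 % 16);
          assert(div16(37) == 37 / 16);
          return 0;
      }
      ```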

      While I agree they are costly, isn’t piping data to an accelerator just as costly (even with the optimized on-chip extensions)? Unless you’re doing lots of floating-point calculations, the comment stands. If you only need one or the other in limited quantities, a bit sieve should suffice.

  2. Ed Singleton says:

    No one believes that slow code is better than fast code. Everyone would make their code fast if it were trivial to do so, so what I think you must be saying is that making your code fast is more important than other forms of optimisation.

    Off the top of my head, some things you can optimise code for are:

    – speed (processor time)
    – speed (developer time)
    – memory
    – disk usage (storage)
    – code size (fewest lines)
    – readability
    – maintainability
    – clarity
    – security
    – enjoyment / satisfaction
    – cost (a balance of developer time, memory, speed and storage)

    All of these can be optimised at the expense of the others, and I can see no reason why speed (processor time) should be at the top of the list except in special cases.

    I suspect that really you are putting enjoyment/satisfaction at the top of the list (which is perfectly reasonable to do), and that you enjoy optimising code for speed.

    Personally I enjoy optimising code for readability/maintainability, which is why I put that near the top of the list (along with developer time and enjoyment). I tend to find that speed conflicts with all of those fairly often.

    I do entirely agree with your broader point that one should have discipline. I personally don’t think discipline in optimising for speed is a particularly good general use of discipline, nor do I think that personal cleanliness is a very good use of it (it is in fact a very bad use and has been linked to allergies, asthma and weak immune systems).

    1. I enjoy optimising code for readability/maintainability

      I have looked long and hard, and I have never come up with a satisfying measure of what it means to be readable in objective terms. Lots of reasonable people object to C because it is harder to read, so they work in C++ instead. Lots of equally reasonable people find old-school C more readable than C++. Many people, maybe most people, love pure functional programming because it is easier to read and more maintainable. There have been lots of formal studies to assess this point, and, frankly, the scientific evidence is lacking. The latest fashionable language is Rust, and people swear by it for its safety, readability, and maintainability. Other, equally reasonable people find it very hard to work with Rust.

      I could go on.

  3. TheFattestNinja says:

    I agree with the whole “you should care” set of points, but (pretty much) all the recommendations you posted relate to writing more performant code, not cleaner code.

    Arguably, some of them are at times in conflict with cleaner code, such as “avoid multiple passes” (keeping track of multiple pieces of state during a single pass is messier than doing one thing at a time) and “prefer simple value types” (using a String instead of creating a specific strong type for, say, a ProductId and a CustomerId).

    It almost seems you (I don’t know your background exactly) come from a fairly low-level coding background, where there is a generally different sense of what “beautiful” and “good” code are (not judging it wrong or right, just not universal).

    Imho adaptability and readability are far more important and more worth pursuing than performance, but I agree on the overall “at least do care about SOMETHING”! 🙂

    1. “prefer simple value types” (using a String instead of creating a specific strong type for, say, a ProductId and a CustomerId).

      I’d argue that your ProductId should be an integer.

      Strings are hard. There are entire books written about character encodings alone. There are long debates about how you define what a character is, what “string length” means, and so forth. How does your favorite language handle strings? See my post Are your strings immutable?: there is no widespread agreement on how to treat string values.

      Imho adaptability and readability are far more important and more worth pursuing than performance

      “Adaptability and readability” can mean many things to different people. To some people, it means transforming all your code so it relies on pure functional programming. Then, for others, it means compile-time template metaprogramming. And so forth.

      Adaptability and readability are frequently at odds. Specialized code that does one thing and only one thing is often much more readable than highly customizable code.

    2. It seems to me that we’re talking about nice-to-haves and ideals, i.e. things to optimise if we can afford the effort, and if they don’t impact our other concerns (readability, correctness, etc.).

      In which case, I think your examples are poorly chosen:

      > Arguably, some of them are at times in conflict with cleaner code, such as “avoid multiple passes” (keeping track of multiple pieces of state during a single pass is messier than doing one thing at a time)

      There’s a difference between the code we write and the code that runs. *Ideally* we would write each pass as a separate chunk of code, self-contained and unable to conflict with the others, and these would be combined into a single pass by some other chunk of code. In fact, such pass-combining code might turn out to be useful for many other projects 🙂 I say “chunk of code”, since there may be many ways to implement this, e.g. with functions, templates, compiler optimisations, multi-stage programming, etc.
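
      A tiny sketch of the idea (the `fuse` and `apply` helpers are hypothetical names, one of many possible implementations): each pass is written as its own function, and a combinator fuses them into a single loop over the data:

      ```cpp
      #include <cstddef>

      // Two self-contained per-element “passes”, written separately.
      double scale(double x) { return x * 2.0; }
      double shift(double x) { return x + 1.0; }

      // A combinator that fuses two passes into one.
      template <typename F, typename G>
      auto fuse(F f, G g) {
          return [=](double x) { return g(f(x)); };
      }

      // Applies one operation in a single traversal of the data.
      template <typename Op>
      void apply(double* data, std::size_t n, Op op) {
          for (std::size_t i = 0; i < n; i++) data[i] = op(data[i]);
      }

      int main() {
          double data[] = {1, 2, 3, 4};
          apply(data, 4, fuse(scale, shift)); // one pass, two transformations
          return 0;
      }
      ```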

      > and “prefer simple value types” (using a String instead of creating a specific strong type for, say, a ProductId and a CustomerId).

      Strong types aren’t in conflict with fast, zero-overhead code. We can use “nominal types” to solve this; these are types where checking only succeeds when the names match. We can define ProductId and CustomerId as nominal types which are implemented as (say) String. Then, if we use a String or CustomerId when a ProductId is expected, we get a type error, since it’s the name which gets checked, even though they’re all represented by the same bit-patterns in memory. After type checking, the code generator can replace all occurrences of ProductId and CustomerId with String, and compile away any casts (e.g. `mkCustomerId(myString)`, or whatever) since it knows they’re no-ops. I’m not sure how widespread support for this is, but I know it’s available in Haskell with the `newtype` keyword.
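
      For comparison, a roughly analogous zero-overhead sketch in C++ (the tag-based `Id` wrapper and the names here are hypothetical):

      ```cpp
      #include <string>
      #include <utility>

      // A tag-based “nominal type”: ProductId and CustomerId share the same
      // representation (std::string) but are distinct types to the checker.
      template <typename Tag>
      struct Id {
          std::string value;
          explicit Id(std::string v) : value(std::move(v)) {}
      };

      using ProductId = Id<struct ProductTag>;
      using CustomerId = Id<struct CustomerTag>;

      void ship(const ProductId&) {} // accepts only ProductId

      int main() {
          ProductId p{"p-42"};
          CustomerId c{"c-17"};
          ship(p);    // OK
          // ship(c); // compile-time error: CustomerId is not a ProductId
          return 0;
      }
      ```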

  4. jld says:

    Hummm…
    No, I think the best quality in software is that it should be easy to understand, so a “dumb” linear search is better than any other more clever search if it is good enough given the context.
    And don’t tell me that it couldn’t possibly scale (YAGNI…)

    1. A linear search is often the fastest code you can write.
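
      A minimal sketch of why: a plain scan has predictable branches and sequential memory access, which on small arrays often beats cleverer searches:

      ```cpp
      #include <cstddef>
      #include <cstdint>

      // Returns the index of the first occurrence of key, or n if absent.
      // Predictable branches and sequential access make it fast in practice.
      std::size_t find(const int32_t* data, std::size_t n, int32_t key) {
          for (std::size_t i = 0; i < n; i++) {
              if (data[i] == key) return i;
          }
          return n; // not found
      }
      ```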

  5. Martin Pal says:

    I think what you’re really saying is that it is inconsiderate to write code that makes its readers cringe. I agree that if code can be changed to look less offensive on the “this looks stupid slow” front without sacrificing on other dimensions (readability, modularity, maintainability) it’s a win. But everything can be taken too far.

    As an example, a good rule of thumb for keeping things efficient is to pass large objects by reference. This makes a ton of sense if you spend your days writing C++ code for a high-QPS web service or a data pipeline crunching terabytes of data. At Google, we have automated tools that complain if you pass, say, an object by value when a const reference would do.

    On the other hand, if you find yourself writing configuration logic for your data pipeline, different rules apply. The configuration logic (I’m talking about Google’s Flume) is run once at the beginning of a multi-hour many-machine job, and no matter how inefficiently you write it, it will finish under a second. For the configuration logic, readability and obvious correctness are traits to optimize for, and nobody should care if that input file name string is copied a few more times than strictly necessary.

    1. As an example, a good rule of thumb for keeping things efficient is to pass large objects by reference. This makes a ton of sense if you spend your days writing C++ code for a high-QPS web service or a data pipeline crunching terabytes of data. At Google, we have automated tools that complain if you pass, say, an object by value when a const reference would do.

      The only programming language where I have ended up accidentally copying large objects is C++. And it is not always trivial to find out where it happens, unfortunately. And it will rarely end up being a performance concern, because copies are really very fast.
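
      For illustration, a minimal sketch of one such accidental copy (the function is hypothetical):

      ```cpp
      #include <cstddef>
      #include <string>
      #include <vector>

      // The loop variable is declared by value, so every iteration
      // copies a full std::string: easy to write, easy to miss.
      std::size_t total_length(const std::vector<std::string>& names) {
          std::size_t n = 0;
          for (auto name : names) { // copies each string
              n += name.size();
          }
          // for (const auto& name : names) would avoid the copies
          return n;
      }
      ```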

      nobody should care if that input file name string is copied a few more times than strictly necessary

      Copying data is fast and it can be faster to work from local copies of the data than to constantly refer back to some functions… with the caveat that caching can be evil too when values can change dynamically.

  6. Simon says:

    FYI I recently answered a question on Quora about writing better and faster code [1]. It’s a longer answer, and after you’ve read it, discipline is definitely a word which pops into the mind. Although in my experience developers all have different levels of discipline, so IMO it’s important to create a system which enforces discipline where possible, e.g. code cannot be pushed if it doesn’t meet code-coverage requirements, or if stress tests do not meet a certain throughput. Even the best coders can relax their discipline late on a Friday afternoon 🙂 And this is also a reason why pair programming is useful: you get ‘redundant discipline’ 🙂

    [1] https://www.quora.com/What-are-some-techniques-to-write-better-and-faster-code/answer/Simon-Hardy-Francis

    1. Good post!

  7. Miron Brezuleanu says:

    IMO following your advice would wipe out most PHP software 🙂

    What’s more valuable: bad software that gets the job done or no software?

    I think this is the “PHP question”.

    I remember reading some time ago that one of the bosses on the Microsoft Office team allegedly said that “we could build a bug free Office, but it would cost at least 5kUSD for one copy and nobody would buy it at that price point.”

    I think we should consciously optimize for value while aware of the short/long term compromises, but that’s just my way of keeping my tee-shirt clean 🙂

    1. There is amazing PHP software out there. It started out as a rather naive programming language, but modern-day PHP is quite amazing.

      Could it not be reasonable to think that most of the value in software is produced by a few individuals? The salary distribution sure seems to indicate that this is the case. The average programmer is poorly paid whereas others make as much as rock stars.

      What’s more valuable: bad software that gets the job done or no software?

      The value of most software (99%) is zero. You know all this enterprise software that gets pushed onto employees? Yeah. That. We don’t care. It could go away tomorrow and the world would be better off.

      I remember reading some time ago that one of the bosses on the Microsoft Office team allegedly said that “we could build a bug free Office, but it would cost at least 5kUSD for one copy and nobody would buy it at that price point.”

      There was office software before Microsoft published the first version of Windows. Microsoft mopped the floor with its competitors; they could not keep up. Then the Web came, and AltaVista dominated as the search platform… but they could not update their index because of their crappy software. Google won.

      And so on.

      Crappy, unreliable and slow software kills businesses all the time.

      The Mozilla folks were close to a massive victory in the browser war. But they did not care about performance. They could not adapt to mobile platforms because of their top-heavy approach. People moved to other browsers. Will they survive? I don’t know. If they do, it will be because of a massive reengineering that they just completed.

      1. Simon says:

        PHP is an interesting language because it is still very useful, if not the sexy language of the day the way Rust is said to be. However, many people don’t realize that PHP is actually a rock-star language in Asia, where it powers many of the giant social media and web sites that are so big they make our western household names like Twitter and Facebook look puny. As such, there is amazing technology like Swoole [1], powered by Asian rock-star developers and available on GitHub, but not recognized much in the western world… 🙂

        [1] https://github.com/swoole/swoole-src

        1. Stephan Sokolow says:

          I always point back to https://eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design/ when I hear these kinds of arguments.

          Sure, PHP may have improved, but the kinds of problems shown there are the kind where, even supposing a massive influx of highly-experienced language designers to prevent new mistakes from creeping in, you’re still left with a language with deep flaws which would require a Python 2/3-style split to fix.

          (i.e. I see claims that PHP “is better now” as being equivalent to claims by C++ aficionados who dismiss Rust by pointing to all of the proposals to add its features to C++. The “secret sauce” isn’t just in the new stuff that a competitor brings to the mix, but also in the baggage the incumbent is forced to carry to remain compatible with existing code and developer expertise.)

          1. At this point, I don’t think you can make C++ worse by adding new features to it.

      2. Jay says:

        Fun fact: Mozilla stays afloat by selling out via Google and Yahoo.

        As far as I know they only recently stopped selling user statistics to Google. There is also talk of switching from Google’s $300M deal (to make Google the default search engine) to a $375M deal with Yahoo.

        I’m not much of a Mozilla fan, but it’s my opinion that the Mozilla Foundation will stay afloat thanks to its investments alone, regardless of its statistics- and user-information-vending side ventures. Many of their APIs wind up as de facto standards long before they ever become de jure, bolstering their continuity.

        Regarding the Microsoft Office suite… I completely agree because, let’s be honest, does anyone really miss Lotus?

      3. Miron Brezuleanu says:

        Hello again,

        first, I think there is a problem with your comment system: I expected to get an email notification when there was a reply to my comment, but I didn’t get any notification (I did check the spam folder 🙂 ).

        I think we don’t have the same definition of “value”. Sure, for you and me most enterprise software has no value (or even negative value, as in “it seems to do more harm than good”), but for whoever is buying it, it may solve problems. Problems we don’t even think about, possibly. Problems they probably wouldn’t have if they had done other things right. But in the context of the buyer, problems they need to solve (and solving them badly is much better than not solving them at all).

        “Value” to you and me probably means that the valuable thing solves big problems for lots of people. To other people, it may simply mean “I get home in time for dinner today”. There is nothing wrong with either view. And yes, individual salaries for adding the first kind of value are larger than individual salaries for adding the second kind. But I also think the aggregate salary for the first kind is less than the aggregate salary for the second kind 🙂

        So while “value” in general is a very complicated term to define, I guess I meant “value to the buyer”.

        So, to clarify, I try to optimize for “value to the buyer”. And yes, that means good performance a lot of the time. But not always. Sometimes giving them a good model of their problem and solution is much more important than the performance of the solution’s implementation (with a good model they can switch to a better implementation later, if needed). Not all stains on tee-shirts are cleaned the same way 🙂

        As for Mozilla, I think they constantly cared for performance (the simple fact that they tried to trim down Mozilla to Phoenix/Firebird/Firefox shows that they cared). They just had a big horrible code base to deal with. I remember at some point they had some projects that detected dead code in a large code base so it could be removed. I guess they had a pretty bad problem with global dead code. I do wonder why they didn’t just rewrite it (they had 15 years to do it). I guess they thought it would end up the same if they started with the same tools (C++ etc.). They are doing a rewrite now in Rust (which they developed because they do care about performance AND maintainability).

        1. Stephan Sokolow says:

          Mozilla didn’t rewrite it because their “killer app” was their extension ecosystem and legacy extensions were based on a design of “There is no API. Monkey patch whatever browser internals you like.”

          That’s what gave them their power to innovate, but also locked the browser into old designs. (Look at any long-running Firefox extension where the author has a history of supporting a wide span of browser versions from each extension release and you’ll see a ton of “if browser version X, do Y” blocks, despite the browser hamstringing itself to remain compatible.)

          WebExtensions is essentially their admission that the legacy extension APIs (the foundation of their primary advantage over Chrome) had become a gangrenous limb. Rather than amputating it, they kept trying to adapt the old paradigm to a multi-process future for so long that, when they finally admitted they had to amputate, the migration to WebExtensions came across as rushed, and it is still uncertain whether Firefox’s market share can recover.

        2. As for Mozilla, I think they constantly cared for performance … They just had a big horrible code base to deal with. I remember at some point they had some projects that detected dead code in a large code base so it could be removed. I guess they had a pretty bad problem with global dead code.

          I think you could say the same thing about any business that fails due to its poor software. “Oh! They cared, but they had to deal with this big ugly code base”.

          They are doing a rewrite now in Rust (which they developed because they do care about performance AND maintainability).

          You can spin the story in different ways. You could say that they care a lot because they are doing a rewrite. You could also say that they are doing a rewrite because they did not care enough.

          I’d say that the adoption of Rust is definitely a good sign, but I view it as a sign that they are making the course correction necessary to stay relevant…

    2. Simon says:

      “we could build a bug free Office, but it would cost at least 5kUSD for one copy and nobody would buy it at that price point.”

      I think this type of comment comes from people who imagine scaling their existing, already badly functioning QA process up to an imagined bug-free level. Of course that is going to be expensive.

      And there’s tons of psychology behind an answer like this too. Firstly, it justifies the crappy status quo of the regular testing methods, so nobody loses their job over the bad decisions made so far. And secondly, most developers hate writing tests themselves and instead rely on QA departments and QA developers, etc. So the answer is ‘test-hater-developer friendly’ too.

      Also, many companies have neglected tests for so long that there would be a huge and unwanted expense associated with building up the level of code coverage by writing tests. And of course, while you’re writing those tests you’re not writing sexy new features. However, in the past I started work at a company which only had 37% code coverage on its code base. I then implemented an automated ‘code amnesty’ whereby only brand-new code needed to have 100% code coverage. Leave this to bake for a few years, don’t write any special tests for the code without coverage, and bingo… after a few years code coverage is over 90%. This is because writing tests for the new code ends up testing the old code around it a bit too. So there are many ways to achieve high levels of QA without huge expense. But it’s probably not going to work well for those companies who pamper developers who are too good to write automated test code…

  8. Ed Vielmetti says:

    “Avoid multiple passes over the data when one would do.”

    This is especially true in this age when the bottleneck is often memory bandwidth.
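
    A minimal single-pass sketch (the `min_max` helper is a hypothetical name, and it assumes a non-empty array): one traversal computes both the minimum and the maximum, so the data is pulled through the memory hierarchy once instead of twice:

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <utility>

    // Computes min and max in one pass; requires n >= 1.
    std::pair<int32_t, int32_t> min_max(const int32_t* data, std::size_t n) {
        int32_t lo = data[0], hi = data[0];
        for (std::size_t i = 1; i < n; i++) {
            if (data[i] < lo) lo = data[i];
            if (data[i] > hi) hi = data[i];
        }
        return {lo, hi};
    }
    ```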

    I love this account of awk vs Hadoop (spoiler: awk wins):

    https://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html