Daniel Lemire's blog

, 10 min read

JavaScript and fast data structures: some initial experiments

12 thoughts on “JavaScript and fast data structures: some initial experiments”

  1. Chris Nahr says:

    Regarding the benchmarks: Java is slower because you’re putting integers into a generic container. Java must box every single one of those integers with an object wrapper to put them into the container, then unbox them again when reading their values.

    There is still no optimization for this, at least until value types are properly supported in Java 10. For decent speed you must use non-generic containers with hard-coded numerical types, see e.g. http://java-performance.info/hashmap-overview-jdk-fastutil-goldman-sachs-hppc-koloboke-trove-january-2015/

    JavaScript engines are smarter about integers: they use “tagged values” for object instances that directly represent 31-bit integer values if a flag bit is zero. See here for V8: http://thibaultlaurens.github.io/javascript/2013/04/29/how-the-v8-engine-works/

    Try rerunning your benchmarks with strings (always objects) or with large integers or non-integers, those should force boxing in Chrome.

  2. Simon says:

    Could this be optimized for JavaScript running in html5 browsers by using canvas objects to manipulate the bitsets?

  3. @Chris

    The important point is what you just wrote: “JavaScript engines are smart”. 😉

  4. @Simon

    I am not exactly sure how canvas objects could be used to manipulate bitsets. So my answer is: I do not know.

  5. Addam says:

    After looking at your JavaScript loop, try rewriting it like this and see how much your performance increases:

    answer.words = new Unit32Array(answer.count);
    var k = 0
    , nCount = answer.count
    ;

    for (; k < nCount; k++) {
    answer.words[k] = t[k] | o[k];
    }

  6. Francis Kim says:

    Thank you for the great work! I’ve just starred it.

  7. plam says:

    JavaScript doesn’t quite differentiate between ints and floats; that has to be inferred by the runtime system. However, it looks like JavaScript does not support 64-bit integers; anything above 2^53-1 is going to be represented by a float.

    https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER

    v8 has 32-bit ints.

  8. Simon says:

    @Daniel

    I was thinking that an entire bitset could be stored in a canvas. Then two canvases can be composited together to show for example the logical AND of all the bits. Then the bits read our if the canvas.

  9. @Simon

    How fast can you query and set individual bits in a canvas?

  10. @Addam

    As I wrote in my post, the bulk of the running time is spent executing the object allocation :

    answer.words = new Unit32Array(answer.count);

    I can remove all the computations that follow, and the speed won’t increase by more than 50%.

    So what I need is some way to optimize the Unit32Array.

    I can get rid of it and replace it by a simple Array, and the speed improves in this case, but it degrades in other ways. Also, I suspect that Unit32Array uses less space than Array. (Probably half as much.)

  11. Hi Daniel, why the conflation of JavaScript with V8? I’d be curious to see the results in other JavaScript engines, especially Microsoft’s Chakra and Firefox. Chakra is the fastest JS engine according to benchmarks spanning the last year or so (Edge being the browser, though Chakra in IE11 won benchmarks when it was new).

    Firefox and Edge are especially intriguing because of their full support for asm.js, the fast subset of JavaScript that is emitted by Emscripten. It would be fascinating if asm.js made a big difference for these workloads. It offers a richer set of types, including explicit integers, though not 64-bit. Only Firefox and Edge support asm.js and perform AOT compilation for it, though V8 is supposed to benefit from asm.js code to some extent.

    1. > why the conflation of JavaScript with V8?

      I have convenient scripts that run benchmarks automatically through node on my linux boxes. V8 happens to be the engine under Node. It is very convenient to be able to run perfectly reproducible benchmarks, and this is made a lot easier with a carefully configured server box (not a laptop!).

      I invite contributed benchmarks based on other engines.

      > Firefox and Edge are especially intriguing because of their full support for asm.js, the fast subset of JavaScript that is emitted by Emscripten. It would be fascinating if asm.js made a big difference for these workloads. It offers a richer set of types, including explicit integers, though not 64-bit.

      I have played with ams.js but not tested it thoroughly. I have not yet seen anything interesting. To be clear, I am impressed by Emscripten, but not by what I have seen with ams.js so far.

      I am inviting benchmarks.