Daniel Lemire's blog

, 9 min read

Default random-number generators are slow

13 thoughts on “Default random-number generators are slow”

  1. Stefano says:

    Choosing the correct pseudo-random number generator for a Monte Carlo simulation requires a careful analysis, but as a rule of thumb linear congruential generators (LCG) are inadequate. (At least if you are doing computational science, and not video games. This is not because computational science is a more noble endeavour than video game developing: the simulation of physical systems has different requirements than the simulation of a synthetic world. )

    Moreover LCGs are unsuitable for cryptographic applications.

    In my opinion the real question is not whether the default pseudo-random generator is fast enough, but whether it is good enough for your application.

    1. What are your thoughts on PCG?


      I have included PCG in my benchmark but not in my post. It is quite fast.

      1. Stefano says:

        Sound interesting, but I do not have hands on experience. (Disclaimer: my background is in computational science and engineering, but I’ve never run large scale Monte Carlo simulations. Speed never was my main concern… but this depends of the applications in my area of interest.)

  2. Ben says:

    This kind of thing is exactly why I think threads should be used approximately never by applications. Non-parallel abstractions (events, coroutines, coop threads, etc) for multitasking, and pools of worker processes for parallelism! That way be default you never have to worry about the memory nightmares that lead to by-default thread safe RNGs.

    1. Go has also a slow default random number generator. I focused on Java because it is a more popular language… but the problem is not limited to Java…

      1. Ben says:

        Goroutines are just dressed-up threads. The Go community promotes a CSP style of programming with channels, but you can share memory between goroutines and get data races just like you can in Java or C (or almost any other popular language these days).

  3. Stefano says:

    In mono-threaded env. i’ve used XorShift1024* and i’ve tested it with a bloom filter of 800Mil elem Precision 0.000001f and it’s seems real fast and good enough to not create clash: or not?

  4. Viktor Szathmary says:

    You should give ThreadLocalRandom a try 🙂

    1. Excellent. I have updated my blog post and credited you with this important observation.

      It does not change the message of my blog post, but it does point the programmers in a useful direction.

  5. Does the “concurrent random number generator” mean the one which *locks* the seed or the internal state every time during the computation? That might be *very* slow. I would appreciate if you could clarify this. Giving independent seeds for each threads would not cause this locking overhead.

    FYI, Erlang/OTP PRNGs in rand module has three algorithms: Xorshift116+ as default, Xorshift64* and Xorshift1024* are the other choices. Seeds must be given explicitly, or chosen to use the one stored in the process dictionary (a per-process storage area). Note in Erlang processes are roughly equivalent to C or Java threads, but each process has its own dictionary, not sharable with other processes.

    1. Yes. I think that Java locks the seeds, effectively. Other languages like Go do the same.

  6. Jesper Louis Andersen says:

    Another point, which can be important for a PRNG is the ability to split it. That is, support a function `split : rngstate -> rngstate * rngstate` which produces a fork in the RNG seed at that state.

    This allows you to carry out concurrent work but recreate the original state from a seed, since you can use this to handle non-determinism. And it allows you to use the same RNG for a binary tree, but cut off a subtree without having to roll forward the RNG state. You just split at each node.

    Some libraries, notably those who provide tooling for property-based-testing are extremely reliant on such a splitting function in the PRNG they use.

    1. Interesting. Can you elaborate? Which libraries rely on splitting?