I was optimizing one of my Monte Carlo simulation programs not so long ago. The program in question was used for simulation of movement of large number of particles under different conditions, but always under assumption that no collisions can occur (large mean free path). Each particle’s movement was therefore totally independent from movement of any other particle. The simulation was already executed in several cores, but I also added another optimization where I used AVX/AVX2 intrinsics. Previously I was simulating the movement of one particle at a time in every thread, but after the optimization I started simulating the movement of four particles simultaneously in every thread. On each step I had to do interpolations, compute results of various equations and so on and optimisation with intrinsics was totally straightforward and it gave a nice (and expected) performance boost. On every step I also had to generate random numbers and I made myself a simple vectorized implementation of xoroshiro128+ and this sped up the simulation nicely, because I could generate 4 pseudorandom numbers at once, i.e. one for every particle. So, to conclude – vectorizing the PRNG turned out to be a good decision in my case.
I’d use the RDRAND machine instruction for everything except Intel messed up their early hardware implementation of it and I get an illegal instruction exception if I try to use it on my laptop.
Actually my main use is evolutionary algorithms and there was a paper that showed that pseudo-random number generator are just as good as “true” random. Also there are on-line sources of “quantum noise” random numbers.
The main reason pseudo-random is okay apparently is that the random numbers just need to be non-correlated with the problem and that is a low hurdle.
I was optimizing one of my Monte Carlo simulation programs not so long ago. The program in question was used for simulation of movement of large number of particles under different conditions, but always under assumption that no collisions can occur (large mean free path). Each particle’s movement was therefore totally independent from movement of any other particle. The simulation was already executed in several cores, but I also added another optimization where I used AVX/AVX2 intrinsics. Previously I was simulating the movement of one particle at a time in every thread, but after the optimization I started simulating the movement of four particles simultaneously in every thread. On each step I had to do interpolations, compute results of various equations and so on and optimisation with intrinsics was totally straightforward and it gave a nice (and expected) performance boost. On every step I also had to generate random numbers and I made myself a simple vectorized implementation of xoroshiro128+ and this sped up the simulation nicely, because I could generate 4 pseudorandom numbers at once, i.e. one for every particle. So, to conclude – vectorizing the PRNG turned out to be a good decision in my case.
I’d use the RDRAND machine instruction for everything except Intel messed up their early hardware implementation of it and I get an illegal instruction exception if I try to use it on my laptop.
Actually my main use is evolutionary algorithms and there was a paper that showed that pseudo-random number generator are just as good as “true” random. Also there are on-line sources of “quantum noise” random numbers.
The main reason pseudo-random is okay apparently is that the random numbers just need to be non-correlated with the problem and that is a low hurdle.