Marcel Ball says:
Depending on the libraries they are using for Python, I strongly suggest looking into PyPy. It is pretty much 100% compatible with anything written in pure Python, but still a bit hit-and-miss for things that go out to native code. They are working on a PyPy-compatible NumPy version, which is one of the big ones for scientific computing.
Just ran this for a comparison on my computer:
PyPy 4.0.1
$ pypy -m timeit -s 'import random' 'random.randint(0, 1000)'
100000000 loops, best of 3: 0.0117 usec per loop
Python 2.7.10
$ python -m timeit -s 'import random' 'random.randint(0, 1000)'
1000000 loops, best of 3: 0.874 usec per loop
Thanks. I have updated my blog post. Interestingly, you can also apparently simply switch to the randint function provided by numpy.
As far as my colleague is concerned, they are relying on numpy, though not for random-number generation. So PyPy is probably not the solution for them.
Yeah – PyPy, and the NumPy version for PyPy, have come a long way. There are still parts of NumPy that need to be completed, but they’ve been making a lot of progress.
I’ve always found that the NumPy random number generators are very good if you’re able to generate the numbers in advance and then draw from the sample. They are, however, much slower for generating single numbers.
I’ve just tested on my own machine, and the main limiting factor seems to be the interfacing above the underlying C code, as it generates 10^10 integers in the range (0, 1000] in about 2 microseconds, and 10^4 in roughly the same time.
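To make the "generate in advance, then draw from the sample" idea concrete, here is a minimal sketch. It is not code from anyone in this thread: the class name BufferedRandInt and its fill hook are made up for illustration, and the default fill uses only the standard library (Python 3.6+ for random.choices) so that it runs without NumPy.

```python
import random


class BufferedRandInt:
    """Serve random integers from a pre-generated batch, refilling when empty."""

    def __init__(self, low, high, batch_size=4096, fill=None):
        self.low = low
        self.high = high  # inclusive upper bound, like random.randint
        self.batch_size = batch_size
        # Hypothetical hook: fill(low, high, n) must return n integers in [low, high].
        self._fill = fill or self._stdlib_fill
        self._buffer = []

    @staticmethod
    def _stdlib_fill(low, high, n):
        # One call that produces n values, sampled uniformly with replacement.
        return random.choices(range(low, high + 1), k=n)

    def next(self):
        # Pay the batch-generation cost only when the buffer runs dry.
        if not self._buffer:
            self._buffer = list(self._fill(self.low, self.high, self.batch_size))
        return self._buffer.pop()


gen = BufferedRandInt(0, 1000)
values = [gen.next() for _ in range(10000)]
print(min(values), max(values))
```

With NumPy installed, the batch could instead come from something like fill=lambda lo, hi, n: numpy.random.randint(lo, hi + 1, n).tolist(), keeping in mind that numpy.random.randint excludes its upper bound. Whether the buffering pays off depends on the interpreter and on how costly a single call is, so it is worth timing on your own machine.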
Jikes, it gets worse with a newer python:
# python 2.7
λ python -m timeit -s "import random" "random.randint(0, 1000)"
1000000 loops, best of 3: 1.64 usec per loop
# python 3.5
[dev35] λ python -m timeit -s "import random" "random.randint(0, 1000)"
100000 loops, best of 3: 2.2 usec per loop
Or use numpy if the problem is vectorizable (one float -> numpy array of floats), but even a single call seems to be faster:
[dev35] λ python -m timeit -s "import numpy" "numpy.random.randint(0, 1000)"
1000000 loops, best of 3: 0.45 usec per loop
# array of 1000 ints:
[dev35] λ python -m timeit -s "import numpy" "numpy.random.randint(0, 1000, 1000)"
100000 loops, best of 3: 10.8 usec per loop
numba might also be a way to speed it up with a single additional line: https://jakevdp.github.io/blog/2015/02/24/optimizing-python-with-numpy-and-numba/
The slowdown was probably caused by Python folks recently making random() secure. See “Python and crypto-strength random numbers by default”
https://lwn.net/Articles/657269/
The problem seems to be present in versions of Python that predate this discussion by a long shot.
On the latest Python 2.7, numpy randint is not faster! In fact, it’s about 10x slower than fastrand:
python -m timeit -s 'import fastrand' 'fastrand.pcg32bounded(1001)'
10000000 loops, best of 3: 0.0994 usec per loop
python -m timeit -s 'import numpy' 'numpy.random.randint(0, 1000)'
1000000 loops, best of 3: 1.07 usec per loop
I can’t say my tests are hard science, I’ve only just started learning, but the random and randint seem to be interacting with the timeit() function in some peculiar way for me.
def test_spam():
    foo = 41
    for i in range(0, 10):
        x = randint(0, 10)
        spam(foo)
This took a little over 54 seconds by timeit()’s calculations on my (not really old but slow) computer in powershell. Are there other timer tests to use in Python on the randint? I have only run 24 other small tests, but they all suggest something fishy is going on, and I have no idea if it’s in my computer or the modules.
I do not understand your code. What does it do?
Literally nothing… I had a piece of code that crashed when I tried to use timeit on it, and I thought I had traced it down to randint being used to create a doubly linked list inside a bubble sort that was being tested. Here’s another one, which took 1600 s:
from random import randint
def test_spam():
    foo = 41
    for i in range(0, 299):
        x = randint(0, 10)
        spam(foo)
if __name__ == '__main__':
    import timeit
    print(timeit.timeit("test_spam()", setup="from __main__ import test_spam"))
That’s it, I’m just trying to see why randint and timeit crashed my original code.
I can’t edit… I forgot the def spam():
def spam(x):
    y = x + 1
    return y
I’m so sorry… I’m new. I didn’t know timeit had a default of 1,000,000 runs… no wonder! It still crashed my command prompt, though. Sorry.
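For anyone else who runs into this: timeit.timeit() repeats the statement number=1000000 times unless told otherwise, so a function that loops 299 times per call ends up making hundreds of millions of randint calls. Below is a minimal sketch of passing a smaller count explicitly. It reuses the test_spam and spam functions from this thread; the count of 1000 is an arbitrary choice for illustration.

```python
import timeit
from random import randint


def spam(x):
    return x + 1


def test_spam():
    foo = 41
    for i in range(0, 299):
        x = randint(0, 10)
        spam(foo)


# Pass number explicitly instead of relying on the default of 1000000.
runs = 1000
total = timeit.timeit(test_spam, number=runs)
print("%.2f usec per call" % (total / runs * 1e6))

# timeit.repeat returns one total per repetition; the minimum is the
# figure usually reported, since it is least affected by other load.
best = min(timeit.repeat(test_spam, number=runs, repeat=3))
print("best of 3: %.2f usec per call" % (best / runs * 1e6))
```

Passing the function object directly also avoids the setup="from __main__ import test_spam" string.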
Can you somehow use this code to draw from a uniform distribution?
The random numbers being generated follow a uniform distribution.
Sorry, I didn’t phrase myself accurately. Can I use this to draw a random number from (0,1)?
Given 32-bit uniformly distributed integers, you can generate 24-bit floats that appear at uniform locations within [0,1) by a computation such as (float)(RandomBitGenerator() & 0xffffff) / (float)(1 << 24). Of course, you discard 8 bits. Python floats are 64 bits.
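In Python, the same computation might look like the sketch below. This is an illustration under assumptions, not the blog's code: random.getrandbits stands in for whatever 32-bit generator you actually use, the function names are made up, and Python 3 is assumed so that / is true division. Since Python floats carry a 53-bit significand, a second variant that uses 53 random bits is shown as well.

```python
import random


def float24_from_u32(u32):
    """Map a 32-bit integer to a float in [0, 1), keeping the low 24 bits."""
    return (u32 & 0xFFFFFF) / (1 << 24)


def float53():
    """Draw a float in [0, 1) from 53 random bits (the full double significand)."""
    return random.getrandbits(53) / (1 << 53)


u = random.getrandbits(32)  # stand-in for a 32-bit generator
x = float24_from_u32(u)
y = float53()
print(x, y)
```

Both results are exact multiples of 2^-24 and 2^-53 respectively, so they are always strictly below 1.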