13th August 2012, 5 min read

On feeding your CPU with data

8 thoughts on “On feeding your CPU with data”

Paul W. Homer says:

August 13, 2012 at 11:40 am

I’m not sure whether it is true, but mainframes are often credited with having much bigger pipes (to move data around).

Way back, I was working in C and a friend was working in APL on a huge mainframe. We’d write the same CPU-intensive algorithms, then compare performance. I never kept track of the numbers, but it was clear that, for smaller jobs, a time-slice of my friend’s machine was nearly equivalent to the speed of my workstation (which was state-of-the-art then), while for big bulk jobs his hardware was often stunningly faster.

Somewhere in the beginning of the OO age, everything became optimized for one-offs, rather than for bulk processing. Usually when I’m optimizing code, the first thing I try is to deal with the data in bulk (followed by memoization) …
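
To make the two techniques named here concrete, the following is a minimal Python sketch, not code from the commenter. The function names (`scale_one`, `scale_bulk`, `expensive`, `process`) are hypothetical and exist only for illustration. It contrasts a per-element call with a single call over a whole batch, and uses the standard library's `functools.lru_cache` as one possible way to memoize a pure function.

```python
from functools import lru_cache


def scale_one(x, factor):
    """One-off style: the caller invokes this once per element."""
    return x * factor


def scale_bulk(values, factor):
    """Bulk style: a single call walks the whole batch in one pass."""
    return [x * factor for x in values]


@lru_cache(maxsize=None)
def expensive(n):
    """Stand-in for a costly pure function (hypothetical workload).

    lru_cache stores the result per distinct argument, so repeated
    arguments are answered from the cache instead of being recomputed.
    """
    return sum(i * i for i in range(n))


def process(ns):
    """Apply the memoized function to a batch of inputs."""
    return [expensive(n) for n in ns]
```

Memoization of this kind only helps when inputs actually repeat and the function has no side effects; otherwise the cache just costs memory.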

Paul.
wn says:

August 13, 2012 at 11:42 am

Are you sure it is an issue of where the data resides and not a scheduling one?
Daniel Lemire says:

August 13, 2012 at 11:57 am

@wn

What do you mean by scheduling? The tests run very fast and I pick the best out of several runs.
KWillets says:

August 13, 2012 at 1:30 pm

RAM is another form of secondary storage, like disk used to be. Cache is now what RAM was conceived to be: a flat memory space with constant access time.
wn says:

August 13, 2012 at 5:50 pm

How long do they run, on which OS, and at what priority? If the test process can be preempted by the OS, which is more likely to happen on longer runs (as with the large data arrays), then you might be measuring process context switches without meaning to, and would probably want to eliminate that…
Daniel Lemire says:

August 13, 2012 at 5:53 pm

@wn

I prefix them with “nice -n -19” on a Linux box. Moreover, they take only a few seconds to run and involve no IO.
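
The measurement approach discussed in this exchange (short runs, keeping the best of several) can be sketched roughly as follows. This is an assumed illustration in Python, not the benchmark used for the post: `best_of` and `make_sum_task` are hypothetical names, and Python's interpreter overhead hides most cache effects, so the sketch only shows the timing method. The reasoning is that preemption and other interference can only add time to a run, so the minimum over several runs is the least disturbed measurement.

```python
import time


def best_of(fn, repeats=5):
    """Run fn() `repeats` times and return the shortest wall-clock time.

    Interference from the OS only ever lengthens a run, so the minimum
    is the measurement least affected by it.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def make_sum_task(n):
    """Build a task that sums n integers held in memory (no IO)."""
    data = list(range(n))
    return lambda: sum(data)


if __name__ == "__main__":
    # Compare the per-element cost for a small and a large array.
    for n in (1_000, 1_000_000):
        t = best_of(make_sum_task(n))
        print(f"n={n}: best {t / n * 1e9:.2f} ns per element")
```

Note that a negative niceness such as `nice -n -19` normally requires root privileges on Linux.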
Mike Stiber says:

August 14, 2012 at 8:50 pm

And things are even more complicated if you’re programming a GPU, which has a much more complicated memory architecture. Or, potentially, if you’re doing multithreaded coding on a multicore machine.

Time to relearn computational complexity.
Itman says:

August 15, 2012 at 11:21 am

@Mike,

in regard to GPUs: what is the current transfer rate between main memory and GPU memory?