The weblog of Nicholas Chapman Indirection isn't slow, unpredictable indirection isPosted 10 Oct 2015Indirection is an extremely common programming technique. In fact, it's well known that all problems in computer science can be solved with an extra layer of indirection. Accessing data through a layer of indirection has some performance impact. The most important aspect of the performance impact however, is if the indirection is predictable. If the ultimately accessed data is accessed in a coherent order, even through a layer of indirection, performance is still pretty good. Some results of the code below, which tests 2^26 memory reads done in various ways: Data size: 256.000 MB no indirection elapsed: 0.02169 s (12.38 GiB/s, 1.196 cycles/read) sum: 4261412864 predictable indices elapsed: 0.03996 s (6.718 GiB/s, 2.203 cycles/read) sum: 4261412864 unpredictable indices elapsed: 0.9277 s (0.2894 GiB/s, 51.15 cycles/read) sum: 4261412864  As you can see, adding a layer of indirection has some performance impact (~2.2 cycles per read instead of ~1.2 cycles.). However, as long as the accessed data is accessed in a predictable, coherent fashion, then the cache and prefetcher on the CPU means the slowdown is not too bad. If, however, the order of access is not coherent and predictable, performance takes a nose-dive, as the CPU cache and prefetcher cannot effectively fetch memory before it is needed or efficiently when it is need, and massive latency is incurred. This kind of performance degradation is what I expect from unpredictable and incoherent memory access, with or without a layer of indirection. The key point is that it's not indirection per se that is slow, but unpredictable and incoherent memory access patterns that are commonly seen with indirection. `const int N = 1 << 26; // num reads const double cycles_per_sec = 3.7e9; // For my i7 Ivy bridge, running in 'turbo-boost' mode. // Allocate some data, fill with semi-random values. std::vector data(N); for(int i=0; i indices(N); for(int i=0; i indices(N); for(int i=0; i=0; --t) { int k = (int)(rng.unitRandom() * t); mySwap(indices[t], indices[k]); } Timer timer; uint64 sum = 0; for(int i=0; i