Reputation: 734
I'm using arrays of elements, many of which reference each other, and I assumed that in that case it's more efficient to use pointers.
But in some cases I need to know the index of an element that I have a pointer to. For example, I have p = &a[i] and I need to know the value of i. As I understand it, i can be computed as p - a. But this operation inherently involves division, which is expensive, whereas computing an address from an array index involves a multiplication and is faster.
So my question is, is cross referencing with pointers in a case where you need the indexes as well even worth it?
Upvotes: 11
Views: 7008
Reputation: 111
Division isn't required, even when the element size is not a power of 2. If the size is odd, the compiler can replace the division with a multiplication by the modular inverse modulo 2^64 (or 2^32 on 32-bit platforms). If it's even, it first shifts out the powers of 2 and then multiplies by the modular inverse of the remaining odd factor.
The reason the compiler can't make this optimization for an ordinary division by a compile-time constant is that it breaks down for numbers that aren't evenly divisible by the right-hand operand. But since the compiler may assume that a pointer difference is a multiple of the size of the pointed-to type, it can make this optimization when subtracting pointers.
For example, this code
#include <cstddef>

struct A
{
    char elems[6];
};

ptrdiff_t ptr_diff(A* a, A* b)
{
    return b - a;
}
compiles to
ptr_diff(A*, A*):
push rbp
mov rbp, rsp
mov QWORD PTR [rbp-8], rdi
mov QWORD PTR [rbp-16], rsi
mov rax, QWORD PTR [rbp-16]
sub rax, QWORD PTR [rbp-8]
sar rax
mov rdx, rax
movabs rax, -6148914691236517205
imul rax, rdx
pop rbp
ret
Upvotes: 1
Reputation:
Indexing an array is dirt cheap, and I've never found any kind of performance boost from using pointers directly instead. That includes some very performance-critical areas, like looping through each pixel of an image containing millions of them: still no measurable performance difference between indices and pointers (though it does make a difference if you can access the image with one sequential loop instead of two).
I've actually found many opposite cases where turning pointers into 32-bit indices boosted performance, once 64-bit hardware started becoming available and there was a need to store a boatload of them.
One of the reasons is obvious: you can take half the space now with 32-bit indices (assuming you don't need more than ~4.3 billion elements). If you're storing a boatload of them and taking half the memory as in the case of a graph data structure like indexed meshes, then typically you end up with fewer cache misses when your links/adjacency data can be stored in half the memory space.
But on a deeper level, using indices allows a lot more options. You can use purely contiguous structures that realloc to new sizes without worrying about invalidation, as dasblinkenlight points out. The indices will also tend to be more dense (as opposed to sparsely fragmented across the entire 64-bit addressing space), even if you leave holes in the array, allowing for effective compression (delta, frame of reference, etc.) if you want to squash down memory usage. You can then also use parallel arrays to associate data with elements without using something much more expensive like a hash table. That includes parallel bitsets, which allow you to do things like set intersections in linear time. It also allows for SoA (structure of arrays) representations, which are likewise parallel arrays and tend to be optimal for sequential access patterns using SIMD.
You get a lot more room to optimize with indices, and I'd consider it mostly just a waste of memory if you keep pointers around on top of indices. The downside to indices for me is mostly just convenience. We have to have access to the array we're indexing on top of the index itself, while the pointer allows you to access the element without having access to its container. It's often more difficult and error-prone to write code and data structures revolving around indices and also harder to debug since we can't see the value of an element through an index. That said, if you accept the extra burden, then often you get more room to optimize with indices, not less.
Upvotes: 13
Reputation: 726479
But this operation inherently involves division, which is expensive, whereas computing an address from an array index involves a multiplication and is faster.
This operation requires a division only when the size of the element is not a power of two, i.e. when the element is neither a pointer nor one of the standard types, whose sizes are powers of two on most systems. Dividing by a power of two is done with a bit shift, which is extremely cheap.
computing an address from an array index involves a multiplication and is faster.
Same logic applies here, except the compiler shifts left instead of shifting right.
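A sketch of both directions, assuming an 8-byte element type (the function names are mine): each operation compiles to a shift or a scaled addressing mode, never a division instruction:

```cpp
#include <cstddef>
#include <cstdint>

// With an 8-byte element, b - a compiles to a subtract followed by an
// arithmetic shift right by 3 (divide by 8); no division instruction.
ptrdiff_t diff_pow2(const int64_t* a, const int64_t* b)
{
    return b - a;              // (byte difference) >> 3
}

// Going the other way, a + i scales i by 8 with a shift left by 3 (or a
// scaled addressing mode) and adds it to the base address.
const int64_t* index_pow2(const int64_t* a, ptrdiff_t i)
{
    return a + i;              // a + (i << 3) bytes
}
```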
is cross referencing with pointers in a case where you need the indexes as well even worth it?
Counting CPU cycles without profiling is a case of premature optimization, and a bad basis for a design you are just starting.
A more important consideration is that indexes are more robust, because they often survive array reallocation.
Consider an example: say you have an array that grows dynamically as you add elements to its back, an index into that array, and a pointer into that array. You add an element to the array, exhausting its capacity, so now it must grow. You call realloc, and get a new array (or the old array, if there was enough extra memory after its "official" end). The pointer that you held is now invalid; the index, however, is still valid.
Upvotes: 21